STUDY OF Y-STR POLYMORPHISM: A COMPARISON 
BETWEEN UNRELATED INDIVIDUALS BELONGING TO 
KHANDAYAT AND NON-KHANDAYAT COMMUNITY OF 

ODISHA AND RANDOM INDIAN POPULATION 


THESIS SUBMITTED IN PARTIAL FULFILLMENT 
OF THE REQUIREMENTS FOR THE AWARD OF DEGREE OF 


DOCTOR OF PHILOSOPHY IN FORENSIC SCIENCES 


AMITY 


UNIVERSITY 


BY 
BISWA PRAKASH NAYAK 


AMITY INSTITUTE OF FORENSIC SCIENCES 
AMITY UNIVERSITY, UTTAR PRADESH, INDIA 


2014 


CERTIFICATE 


This is to certify that the thesis entitled, “STUDY OF Y-STR 
POLYMORPHISM: A COMPARISON BETWEEN UNRELATED 
INDIVIDUALS BELONGING TO KHANDAYAT & NON-KHANDAYAT 
COMMUNITY OF ODISHA & RANDOM INDIAN POPULATION” 
submitted to Amity Institute of Forensic Sciences (AIFS), Amity University, 
Uttar Pradesh in fulfillment of the requirement for the award of the degree of 
Doctor of Philosophy, embodies original research work carried out by Mr. Biswa 
Prakash Nayak. This work is original and has not been submitted so far in part or 


full for any other degree or diploma of any other university. 


Dr. Jyoti Singh Dr. Anupuma Raina 
Supervisor Co-supervisor 
Amity Inst of Forensic Sciences Dept. of Forensic Medicine & Toxicology 
Amity University, Uttar Pradesh AIIMS, New Delhi 


Dr. S. K. Shukla 


Professor & Director 
Amity Institute of Forensic Sciences (AIFS) 
Amity University, Uttar Pradesh 


ACKNOWLEDGEMENT 


I express my deepest gratitude to my supervisor, Dr. Jyoti Singh (Amity 
Institute of Forensic Sciences, Amity University, Uttar Pradesh) for providing 
invaluable feedback and always being very helpful and ready to discuss all 
sorts of issues and ideas. She has also offered me expert counsel when I
needed it. It has been a great pleasure and truly rewarding to work with such a 
friendly and accomplished person. I am highly grateful to Dr. S. K. Shukla 
(Professor & Director, AIFS). His constant supervision coupled with a sense 
of concern helped me carry out research in a way which did not make me


feel burdened. 


Deep appreciation goes to my co-supervisor, Dr. Anupuma Raina (Sr. 
Scientist, Dept. of Forensic Medicine & Toxicology, AIIMS, New Delhi). Dr.
Raina has expertly guided me through my research from the day I started 
working with her. The free atmosphere and independence provided by her in 
the laboratory enabled me to accept new challenges, and improved my skills 
both mentally and scientifically. I feel very lucky to have her as my
supervisor, and I thank her from my heart for making me a confident Research 
Scholar. I would like to thank Prof. T. D. Dogra (former Director, AIIMS, 
New Delhi) for permitting me to work in the DNA Fingerprinting Lab. 


My gratitude also goes to all the staff, who made working in the Department 


of Forensic Medicine & Toxicology, AIIMS such a pleasant experience.


I would also like to thank my colleagues from the AIFS, especially Himanshu 


Khajuria, for thorough discussions and for proofreading this thesis.


I thank my parents (Ashalata Nayak and Laxmidhar Nayak) for their never- 
fading love, encouragement, blessings, and for guiding me on the right track. 
Finally, I would like to thank my wife Sapna for her undying patience, love 


and support over the years - and those to come. 


I would like to express my deep sense of gratitude to all those who have 
directly or indirectly helped me in successful completion of the present work. 
I render my sincere thanks to all my friends who have been a pillar of support
and have stood by me, encouraged and supported me throughout my difficult 


times. 


I would like to thank Dr. Ashok K. Chauhan (Founder President, Amity 
Group of Institutions & Founder, Ritnand Balved Education Foundation) for 
providing me an opportunity to complete my Ph.D. at AUUP. I thank 
Chancellor, Vice-Chancellor, Pro Vice-Chancellor, and members of DRC for 


their support and recommendations. 


Biswa Prakash Nayak 




CONTENTS 


INTRODUCTION 
LITERATURE REVIEW 
METHODOLOGY 
RESULT & DISCUSSION 
CONCLUSION 
BIBLIOGRAPHY 
PUBLICATIONS 
APPENDICES 




Introduction 


Who are we? Where do we come from?
These are some of the most baffling questions of the genomics era. The answer
to them probably lies in our genome itself.


Human individualization based on the genome exploits the fact that everyone 
except for identical twins is genetically distinguishable. Moreover, genetic 
material is found in every nucleated cell in the body and can be recovered 
from samples as diverse as bone, blood stains, saliva residues, nasal 
secretions, and even fingerprints (Hoff-Olson et al., 1999; Schiffner et al., 
2005; Wurmb-Schwark et al., 2006). DNA may be recovered from very old 
samples that have been well preserved (Weir, 2001). Over the past 60 years, 
DNA has risen from being an obscure molecule with presumed accessory or
structural functions inside the nucleus to the icon of modern bioscience 
(Alberts et al., 2002; Primrose & Twyman, 2003). New tools of molecular 
biology have enabled forensic scientists to characterize biological evidence at 
the DNA level (James & Nordby, 2005; Thompson & Black, 2007). The 
ability to type DNA from biological evidence is one of the most important 
developments in forensic science. DNA technology affords the forensic 
scientist the ability to eliminate individuals who have been falsely associated 
with a biological sample and to reduce the number of potential contributors to 
a few (if not one) individuals (Gardner et al., 2002; Watson et al., 2004). The 
technology today includes a number of genetic markers, a variety of valid DNA
typing strategies, and analytical software, all of which make developing DNA
profiles and searching DNA databanks relatively rapid and facile (Butler,
2012). Since cases can be analyzed more rapidly and DNA databanks can be 
generated more rapidly than a decade ago, DNA data-banking can be 
established and used to search DNA profiles/records to help resolve a number 


of violent crimes. 




1.1 DNA - THE BLUEPRINT OF LIFE 


The year 1869 was a landmark in genetic research: it was the year in
which Swiss physician Friedrich Miescher first identified nucleic acid (DNA),
which he named “nuclein” (Brown, 2002; Wolf, 2003; Hedrick, 2005;
Dahm, 2008). Russian biochemist Phoebus Levene was the first to discover
the order of the three major components of a single nucleotide (phosphate-
sugar-base) and the carbohydrate components of RNA (ribose) and DNA
(deoxyribose) (Levene, 1919). Erwin Chargaff was one of a handful of
scientists who expanded on Levene's work by uncovering additional details 
about the structure of DNA (Kendall & Osterberg, 1919; Hershey & Chase, 
1952; Chargaff, 1950, 1971). The story of DNA often seems to begin in 1944 
with Avery, MacLeod, and McCarty showing that DNA is the hereditary 
material (Avery et al., 1944; McCarty, 1994). Without the scientific 
foundation provided by these pioneers, Watson and Crick may never have 
reached their groundbreaking conclusion of 1953: that the DNA molecule 
exists in the form of a three-dimensional double helix (Wilkins et al., 1951, 
1953; Watson & Crick, 1953; Franklin & Gosling, 1953; Pauling & Corey, 
1953; Rich & Watson, 1954). Within another decade, the complexity of 
genetic code was cracked (Nirenberg et al., 1963, 1966; Lederberg, 1994; 
Hartl & Clark, 2006; Lewin, 2007; Gann & Witkowski, 2012). 


DNA is present in the nucleus along with histone proteins in the form of
highly coiled structures called chromosomes (Painter, 1921, 1923). The
haploid human genome contains approximately 3 billion base pairs of DNA
packaged into 23 chromosomes (Venter et al., 2001). The human diploid
chromosome number was correctly determined as 46 by J-H Tjio and A Levan
at the University of Lund, Sweden, in December 1955 (Hsu, 1952; Tjio & Levan,
1956; Speicher et al., 1996; Trask, 2002; Gartler, 2006). The structure of the
chromosome and the packaging of DNA were first described by Thomas &
Kornberg (1975). Humans inherit one set of
chromosomes from their mother and a second set from their father. In total, 
most human cells contain 46 chromosomes with 22 pairs of autosomes, or 
non-sex chromosomes, and two sex-determining chromosomes. The sex 
chromosomes in humans are called X and Y. Females carry two X 
chromosomes, while males carry one X and one Y chromosome (Ford & 
Hamerton, 1956). The Y chromosome has often been used as a marker for 
studying human demographic history (Painter, 1923; Stern, 1957; Jobling & 
Tyler-Smith, 2003). The Y chromosome does not undergo homologous 
recombination, except in the small pseudoautosomal regions (Whitfield et al., 


1995; Pritchard et al., 1999; Mangs & Morris, 2007; Devlin, 2010). 




Figure 1.1: Karyotype of normal human male (Hsu, 1952). 




1.2 DNA POLYMORPHISM 


The human genome is composed of approximately 3 billion base pairs 
organized into an estimated 30,000 genes. The human genome has many 
repeated sequences (Britten & Kohne, 1968; Armour et al., 1989; Venter et
al., 2001; Ellegren, 2004). Tandem repeats are an array of consecutive repeats. 
They include 3 sub-classes: satellites, minisatellites and microsatellites. The 
name satellite comes from their optical spectra. By using buoyant density 
gradient centrifugation, DNA fragments with significantly different base 
compositions may be separated, and then monitored by the absorption spectra 
of ultra-violet light. The main band represents the bulk DNA, and the 
“satellite” bands originate from tandem repeats (Britten & Kohne, 1968; 
Housman, 1995; Collins et al., 1998; Bailey et al., 2002; Primrose & 
Twyman, 2003; Cooper, 2006). The term “polymorphism” describes the 
existence of different forms within a population, e.g., difference in the number 
of tandem repeats. All tandem repeat polymorphisms could result from DNA 
recombination during meiosis (Biemont & Vieira, 2006). The microsatellite 


polymorphism could also be caused by replication slippage (Cooper, 2006). 
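
As an illustration of what a difference in the number of tandem repeats means in practice, the following minimal sketch counts the longest uninterrupted run of a repeat motif in a sequence. The two allele sequences, their flanking bases and the function name are hypothetical and serve only to show the idea; real genotyping relies on fragment sizing or sequencing rather than string matching.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Two hypothetical alleles of one locus: same flanking sequence,
# different number of GATA repeats (assumed values, for illustration only).
allele_a = "CCTG" + "GATA" * 8 + "TTCA"
allele_b = "CCTG" + "GATA" * 11 + "TTCA"

print(longest_tandem_run(allele_a, "GATA"))  # repeat count of allele A
print(longest_tandem_run(allele_b, "GATA"))  # repeat count of allele B
```

The two alleles differ only in repeat number, which is the kind of length polymorphism described above.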


1.2.1 SATELLITES 

The size of satellite DNA ranges from 100 kbp to over 1 Mbp in humans. A
well-known example is the alphoid DNA located at the centromere of all the
chromosomes. Its repeat unit is 171 bp and the repetitive region accounts for 
3-5% of the DNA in each chromosome (Britten & Kohne, 1968; Goodbourn 
et al., 1983; Gomolka et al., 1994). Other satellites have a shorter repeat unit. 


Most satellites in humans or in other organisms are located at the centromere. 




1.2.2 MINISATELLITES 

The size of a minisatellite ranges from 1 kbp to 20 kbp. Minisatellites are also
known as variable number of tandem repeats (VNTRs). Their repeat unit ranges
from 9 bp to 80 bp (Bell et al., 1982; Armour et al., 1989; Horn et al., 1989).
They are located in non-coding regions. The number of repeats for a given 
minisatellite may differ between individuals. This feature is the basis of DNA 
fingerprinting (Jeffreys et al., 1985; O'Connell et al., 1988; Kasai et al., 
1990). The telomere is another type of minisatellite. In a human germ cell,
the size of a telomere is above 15 kb. In aging somatic cells, the telomere is
shorter. The telomere contains the tandemly repeated sequence GGGTTA (Wong
et al., 1987; Tautz, 1989). Another example of a VNTR is the forensic DNA 
marker D1S80. The D1S80 marker is a minisatellite with a 16bp repeat unit 


and contains alleles in the range of 16-41 repeats (Budowle et al., 1991). 


1.2.3 MICROSATELLITES 

Microsatellites are also known as short tandem repeats (STRs), because a 
repeat unit consists of only 2 to 7 bp and the whole repetitive region spans less
than 150 bp. STRs were first reported in the late 1980s and the number of 
repeats for a given microsatellite may differ between individuals (Weber et 
al., 1989; Watson et al., 2004). Therefore, microsatellites can also be used for 
DNA fingerprinting. In addition, both microsatellite and minisatellite
patterns can provide information about paternity (Edwards et al., 1991;
Collins et al., 2003; Urquhart et al., 1993). The most famous case is President
Thomas Jefferson and his alleged sons (Foster et al., 1998). One of the 
greatest mysteries for most of the twentieth century was the fate of the 
Romanov family, the last Russian monarchy; the case was solved by
combined analysis of autosomal and Y-chromosomal STRs (Coble et al., 


2009). 




“Whenever you have excluded the impossible, whatever remains, however 
improbable, must be the Truth.” 

- Sir Arthur Conan Doyle 
1.3 DNA FINGERPRINTING


The process of “DNA fingerprinting” or DNA profiling was first described in
1985 by an English geneticist named Alec Jeffreys. The human genome is full 
of repeated DNA sequences. These repeated sequences come in various sizes 
and are classified according to the length of the core repeat units, the number 
of contiguous repeat units, and/or the overall length of the repeat region 
(Housman, 1995). The number of repeated sections present in a sample could 
differ from individual to individual. Sir Alec Jeffreys developed a technique to 
examine the length variation of these DNA repeated sequences to perform 
human identity tests (Jeffreys et al., 1985, 1986; Wong et al., 1986, 1987). 
These repeated DNA sequences are known as VNTRs (variable number of 
tandem repeats). The technique used by Dr. Jeffreys to examine the VNTRs 
was called restriction fragment length polymorphism (RFLP) because it 
involved the use of a restriction enzyme (Meselson & Yuan, 1968) to cut the 
regions of DNA surrounding the VNTRs. This RFLP method was first used to 
help in an English immigration case and shortly thereafter to solve a double 
homicide case in UK (Jeffreys et al., 1985). Since that time, human identity 
testing using DNA typing methods has become widespread (Marroni, 2001;
Hood & Galas, 2003). 


Any material that contains nucleated cells, including blood, semen, saliva, 
hair, bones, and teeth, potentially can be typed for DNA polymorphisms 
(Higuchi et al., 1988; Kasai et al., 1990; Walsh et al., 1991; Hochmeister et 
al., 1991). The typing of VNTR loci by RFLP analysis is the most 




discriminating, or individualizing, molecular biology technology for forensic 
identity testing. Although this approach is valid and reliable for forensic and 
paternity testing, it has certain limitations. These include: 

1) Sufficient quantity of high molecular weight DNA (usually at least 50 ng) 
is required for RFLP analysis. 

2) Samples that have been substantially degraded cannot be analyzed by 
RFLP typing. 


3) RFLP analysis is laborious as well as time-consuming. 


An alternative strategy for forensic DNA typing is the use of STRs through 
PCR-based assays. DNA regions with short repeat units (usually 2 to 6 bp in 
length) are called short tandem repeats (STR). STRs have proven to have 
several benefits that make them especially suitable for human identification. 
Compared with the RFLP approach, the advantages of PCR-based technologies
include augmented sensitivity and specificity and decreased assay time and 
labor. Also, many degraded DNA samples can be amplified by PCR and 
subsequently typed because amplified alleles generally are much smaller in 
size compared with alleles detected by RFLP analysis. These features make 
PCR a particularly useful tool for analyzing biological material found at crime 


scenes (Comey & Budowle, 1991; Reynolds et al., 1991). 


STRs have become popular DNA repeat markers because they are easily 
amplified by the polymerase chain reaction without the problems of 
differential amplification. This is due to the fact that both alleles from a 
heterozygous individual are similar in size since the repeat size is small (Dib 
et al., 1996). The number of repeats in STR markers can be highly variable 
among individuals, which makes these STRs effective for human identification


purposes (Bar et al., 1997; Thompson & Black, 2007). 




1.4 PCR-BASED GENETIC MARKERS (STRs) 


PCR (Polymerase chain reaction) has revolutionized molecular biology with 
the ability to make hundreds of millions of copies of a specific sequence of 
DNA in a matter of only a few hours (Mullis et al., 1986; Mullis et al., 1987). 
Without the ability to make copies of DNA molecules, many forensic samples 
would be impossible to analyze. DNA from crime scenes is often limited in 
both quantity and quality and obtaining a cleaner, more concentrated sample is 
normally out of the question. The PCR DNA amplification technology is well 
suited to analysis of forensic DNA samples because it is sensitive, rapid, and 
not limited by the quality of the DNA as are the restriction fragment length 
polymorphism (RFLP) methods (Jeffreys et al., 1985; Jeffreys et al., 1986; 
Sajantila et al., 1992). 
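
The claim that PCR yields hundreds of millions of copies follows from simple doubling arithmetic. The sketch below uses the idealized model N = N0 x (1 + E)^n, where E is the per-cycle efficiency (1.0 for perfect doubling) and n the number of cycles. The function name and the example values are assumptions chosen for illustration; real reactions plateau and rarely sustain perfect efficiency.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` rounds of PCR.

    efficiency = 1.0 means every template is duplicated in every cycle.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting template, 28 cycles, perfect doubling: 2**28 copies.
print(pcr_copies(1, 28))
# The same reaction at an assumed 90% efficiency yields noticeably fewer copies.
print(pcr_copies(1, 28, efficiency=0.9))
```

Under perfect doubling, a single template exceeds 2.6 x 10^8 copies after 28 cycles, which is consistent with the order of magnitude quoted in the text.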


PCR permits more than one region of DNA to be copied simultaneously by 
simply adding more than one primer set to the reaction mixture. The 
simultaneous amplification of two or more regions of DNA is commonly 
known as multiplexing or multiplex PCR (Bosch et al., 2002). For a multiplex 
reaction to work properly, the primer pairs need to be compatible. In other 
words, the primer annealing temperatures should be similar and excessive 
regions of complementarity should be avoided to prevent the formation of
primer dimers that will cause the primers to bind to one another instead of the 
template DNA. The addition of each new primer in a multiplex PCR reaction 
exponentially increases the complexity of possible primer interactions. 
The multiplex PCR technique is used extensively in STR-based DNA
fingerprinting (Butler et al., 2001; Butler et al., 2002; Schoske et
al., 2003). Considerable time and effort can be saved by simultaneously 


amplifying multiple sequences in a single reaction, a process referred to as 




multiplex polymerase chain reaction (PCR). Multiplex PCR requires that 
primers lead to amplification of unique regions of DNA, both in individual 
pairs and in combinations of many primers, under a single set of reaction 
conditions. In addition, methods must be available for the analysis of each 
individual amplification product from the mixture of all the products. 
Multiplex PCR is becoming a rapid and convenient screening assay in both 
the clinical and the research laboratory. The development of an efficient 
multiplex PCR usually requires strategic planning and multiple attempts to 
optimize reaction conditions. For a successful multiplex PCR assay, the 
relative concentration of the primers, concentration of the PCR buffer, balance 
between the magnesium chloride and deoxynucleotide concentrations, cycling 
temperatures, and amount of template DNA and Taq DNA polymerase are 
important. An optimal combination of annealing temperature and buffer 
concentration is essential in multiplex PCR to obtain highly specific 
amplification products (D'Aquila et al., 1991). Magnesium chloride
concentration needs only to be proportional to the amount of dNTP, while 
adjusting primer concentration for each target sequence is also essential. The 
list of various factors that can influence the reaction is by no means complete. 
Optimization of the parameters discussed above should provide
a practical approach toward resolving the common problems encountered in 
multiplex PCR (such as spurious amplification products, uneven or no 
amplification of some target sequences, and difficulties in reproducing some 
results). Thorough evaluation and validation of new multiplex PCR 
procedures is essential. The sensitivity and specificity must be thoroughly 
evaluated using standardized purified nucleic acids (Mullis et al., 1986; Mullis 


et al., 1987; Markoulatos et al., 2002). 
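
The two compatibility requirements mentioned above for multiplex primers (similar annealing temperatures and no extensive complementarity between primers) can be screened with a rough sketch such as the one below. It is only an illustration under stated assumptions: melting temperature is estimated with the simple Wallace rule, Tm = 2(A+T) + 4(G+C), which is a crude approximation intended for short oligonucleotides; the 5 degree spread and the 4-base 3' window are arbitrary example thresholds; and the primer sequences are hypothetical. Only primer pairs are compared, so self-dimers are not checked.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_overlap(a: str, b: str, k: int = 4) -> bool:
    """True if the last k bases of primer `a` can base-pair with a stretch of `b`."""
    tail = a.upper()[-k:]
    return reverse_complement(tail) in b.upper()


def flag_incompatible(primers: dict, max_tm_spread: int = 5, k: int = 4) -> list:
    """Return (name1, name2, reason) for every primer pair that looks risky."""
    issues = []
    for (n1, s1), (n2, s2) in combinations(primers.items(), 2):
        if abs(wallace_tm(s1) - wallace_tm(s2)) > max_tm_spread:
            issues.append((n1, n2, "Tm mismatch"))
        if three_prime_overlap(s1, s2, k) or three_prime_overlap(s2, s1, k):
            issues.append((n1, n2, "3' complementarity"))
    return issues


# Hypothetical primers, far shorter than real ones, used only to exercise the checks.
example = {"p1": "GGGGAAGC", "p2": "TTGCTTAA"}
print(flag_incompatible(example))
```

In practice, dedicated primer-design software with nearest-neighbour thermodynamics would replace both heuristics; the sketch only shows why the number of checks grows quickly as primers are added.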




For human identification purposes, it is important to have DNA markers that 
exhibit the highest possible variation in order to discriminate between samples 
(Hammond et al., 1994). The smaller size of STR alleles makes STR markers
better candidates for use in forensic applications, in which degraded DNA is 
common. PCR amplification of degraded DNA samples can be better 
accomplished with smaller target product sizes. Because of their smaller size, 
STR alleles can also be separated from other chromosomal locations more 
easily to ensure closely linked loci are not chosen. Closely linked loci do not 
follow the predictable pattern of random distribution in the population, 
making statistical analysis difficult. STR alleles also have lower mutation 
rates, which make the data more stable and predictable. Because of these 
characteristics, STRs with higher power of discrimination are chosen for 
human identification in forensic cases on a regular basis. They are used to
identify victims, perpetrators, and missing persons, and for personal
identification in cases of mass disaster (Butler et al., 2001; Butler et al., 2002).
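
How strongly a marker discriminates between samples can be quantified from population data. The sketch below computes two commonly used summary statistics: the power of discrimination, PD = 1 - sum of squared genotype frequencies, and the unbiased gene (or haplotype) diversity, h = n/(n - 1) x (1 - sum of squared allele frequencies). The sample data are invented for illustration and the function names are hypothetical; genotypes are assumed to be stored as sorted tuples so that (14, 15) and (15, 14) are counted as the same genotype.

```python
from collections import Counter


def gene_diversity(alleles: list) -> float:
    """Unbiased gene diversity: n/(n-1) * (1 - sum(p_i**2))."""
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two observations are required")
    sum_sq = sum((count / n) ** 2 for count in Counter(alleles).values())
    return n / (n - 1) * (1.0 - sum_sq)


def power_of_discrimination(genotypes: list) -> float:
    """PD = 1 - sum of squared observed genotype frequencies."""
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    return 1.0 - sum((count / n) ** 2 for count in Counter(genotypes).values())


# Invented observations at a single locus (illustration only).
observed_alleles = [14, 14, 15, 16]
observed_genotypes = [(14, 15), (14, 15), (15, 16), (14, 14)]

print(round(gene_diversity(observed_alleles), 4))
print(round(power_of_discrimination(observed_genotypes), 4))
```

For haploid Y-STR data the same diversity formula is applied to whole haplotypes instead of single alleles.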


1.4.1 TYPES OF STR MARKERS 

STR repeat sequences are named by the length of the repeat unit. Dinucleotide 
repeats have two nucleotides repeated next to each other over and over again. 
Trinucleotides have three nucleotides in the repeat unit, tetranucleotides have 
four, pentanucleotides have five, and hexanucleotides have six nucleotides in
the core repeat. Tetranucleotide repeats have become the most popular STR 
markers for human identification. STR sequences not only vary in the length 
of the repeat unit and the number of repeats but also in the rigor with which 
they conform to an incremental repeat pattern (Tautz et al., 1993; Urquhart et
al., 1994). STRs are often divided into several categories based on the repeat 
pattern. Simple repeats contain units of identical length and sequence, 


compound repeats comprise two or more adjacent simple repeats, and 




complex repeats may contain several repeat blocks of variable unit length as 
well as variable intervening sequences. Complex hypervariable repeats also 
exist with numerous non-consensus alleles that differ in both size and 
sequence and are therefore challenging to genotype reproducibly. This last 
category of STR markers is not commonly used in forensic DNA typing due 
to difficulties with allele nomenclature and measurement variability between 


laboratories (Butler, 2001, 2005). 


Among the various types of STR systems, tetranucleotide repeats have 
become more popular than di- or trinucleotides. Penta- and hexanucleotide 
repeats are less common in the human genome but are being examined by 
some laboratories. Stutter product amounts vary depending on the STR locus but
are usually less than 15% of the allele product quantity with tetranucleotide 
repeats. With di- and trinucleotides, the stutter percentage can be much greater 
(30% or more), making it difficult to interpret sample mixtures. In addition, 
the four-base spread in alleles with tetranucleotides makes closely spaced 
heterozygotes easier to resolve with size-based electrophoretic separations 
compared to alleles that could be two or three bases different in size with 


dinucleotide and trinucleotide markers, respectively (Butler, 2010).
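
The 15% figure quoted above for tetranucleotide repeats suggests a simple screening rule: a small peak one repeat unit shorter than a much larger peak may be a stutter artefact rather than a true allele. The following sketch applies that rule. It is a simplification under stated assumptions: alleles are whole repeat numbers (microvariants such as 9.3 are not handled), only stutter one repeat shorter than the parent is considered, the peak heights are invented, and a single threshold is used for all loci, whereas laboratories validate locus-specific values.

```python
def flag_stutter(peaks: dict, threshold: float = 0.15) -> list:
    """Return alleles whose height is consistent with stutter of the next larger allele.

    peaks maps allele (repeat number) -> peak height in arbitrary units.
    """
    flagged = []
    for allele, height in peaks.items():
        parent_height = peaks.get(allele + 1)
        if parent_height and height / parent_height <= threshold:
            flagged.append(allele)
    return sorted(flagged)


# Invented electropherogram data for one locus (illustration only).
single_source = {14: 90, 15: 1000, 17: 950}
possible_mixture = {14: 400, 15: 1000}

print(flag_stutter(single_source))     # allele 14 is 9% of allele 15
print(flag_stutter(possible_mixture))  # allele 14 is 40% of allele 15, so not flagged
```

A peak that exceeds the threshold, as in the second example, cannot be dismissed as stutter and may indicate a second contributor, which is why high-stutter markers make mixtures harder to interpret.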


1.4.2.1 Autosomal Short Tandem Repeats 

For DNA typing markers to be effective across a wide number of 
jurisdictions, a common set of standardized markers must be used. The STR 
loci that are commonly used today were initially characterized and developed 
at the Baylor College of Medicine, Houston, Texas (Edwards et al., 1991; Puers et
al., 1993; Hammond et al., 1994). The Promega Corporation (Madison, WI) 
initially commercialized many markers, while Applied Biosystems (Foster 


City, CA) incorporated some new markers. The STR project beginning in 




April 1996 and concluding in November 1997 involved 22 DNA typing 
laboratories and the evaluation of 17 candidate STR loci. The evaluated STR 
loci were CSF1PO, F13A01, F13B, FES/FPS, FGA, LPL, TH01, TPOX,
VWA, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51,
and D21S11. Details of some commonly used autosomal STRs are given in
Table 1.1.


Table 1.1: Autosomal STR loci (Butler, 2006) 


Locus    | Chromosomal Location                      | Physical Position  | Category & Repeat Motif | Allele Range
TPOX     | 2p25.3, thyroid peroxidase, 10th intron   | Chr 2, 1.472 Mb    | Simple, GAAT            | 4-16
D2S1338  | 2q35                                      | Chr 2, 218.705 Mb  | Compound, TGCC/TTCC     | 15-28
D3S1358  | 3p21.31                                   | Chr 3, 45.557 Mb   | Compound, TCTG/TCTA     | 8-21
FGA      | 4q31.3, alpha fibrinogen, 3rd intron      | Chr 4, 155.866 Mb  | Compound, CTTT/TTCC     | 12.2-51.2
D5S818   | 5q23.2                                    | Chr 5, 123.139 Mb  | Simple, AGAT            | 7-18
CSF1PO   | 5q33.1, c-fms proto-oncogene, 6th intron  | Chr 5, 149.436 Mb  | Simple, TAGA            | 5-16
SE33     | 6q14, beta actin-related pseudogene       | Chr 6, 89.043 Mb   | Complex, AAAG           | 4.2-37
D7S820   | 7q21.11                                   | Chr 7, 83.433 Mb   | Simple, GATA            | 5-16
D8S1179  | 8q24.13                                   | Chr 8, 125.976 Mb  | Compound, TCTA/TCTG     | 7-20
TH01     | 11p15.5, tyrosine hydroxylase, 1st intron | Chr 11, 2.149 Mb   | Simple, TCAT            | 3-14
VWA      | 12p13.31, von Willebrand factor, 40th intron | Chr 12, 5.963 Mb | Compound, TCTG/TCTA    | 10-25
D13S317  | 13q31.1                                   | Chr 13, 81.620 Mb  | Simple, TATC            | 5-16
Penta E  | 15q26.2                                   | Chr 15, 95.175 Mb  | Simple, AAAGA           | 5-24
D16S539  | 16q24.1                                   | Chr 16, 84.944 Mb  | Simple, GATA            | 5-16
D18S51   | 18q21.33                                  | Chr 18, 59.100 Mb  | Simple, AGAA            | 7-40
D19S433  | 19q12                                     | Chr 19, 35.109 Mb  | Compound, AAGG/TAGG     | 9-17.2
D21S11   | 21q21.1                                   | Chr 21, 19.476 Mb  | Complex, TCTA/TCTG      | 12-41.2
Penta D  | 21q22.3                                   | Chr 21, 43.880 Mb  | Simple, AAAGA           | 2.2-17


1.4.2.2 Y-Chromosomal Short Tandem Repeats (Y-STRs) 

Y-STRs are short tandem repeats (STRs) found on the male-specific
Y-chromosome. The coding genes, mostly found on the short arm of the
Y-chromosome, are vital to male sex determination, spermatogenesis and other
male-related functions (Chakraborty, 1985; Jobling & Tyler-Smith, 2003;
Elhaik, 2014). The Y-STRs are polymorphic among unrelated males and are 
inherited through the paternal line with little change through generations 


(Malaspina et al., 1990; Hurles & Jobling, 2001). 


Y-STRs have been used by forensic laboratories to examine sexual assault 
evidence. In a sexual assault case, evidence such as vaginal swabs will contain 
both female and male DNA. Differential extraction is often used to separate 
the male component from the female component. More often, however, the 
male and female components cannot be separated completely (Cerri et al., 
2003). As a result, female DNA may remain prominent even in the
male fraction after separation. When the “male DNA sample” undergoes




the PCR amplification process, the female DNA component is amplified as 
well, sometimes masking the male DNA, which makes analysis difficult 


(Jobling et al., 1997; de Knijff, 2000; Corach et al., 2001). 


Masking does not occur when Y-STRs are examined. Since female DNA carries no
Y-STRs, any Y-STR profile in the evidence can only come from
the assailant(s) in a sexual assault case (Corach et al., 2001). The male
component will be easily detected, since only this part of DNA will be 
amplified. The Y-STR system is especially helpful when there is more than 
one assailant. The mixed pattern in the evidence can help to identify those 
males responsible for the assault. Y-STR typing is also used in non-sexual assault
cases where mixed samples are collected from evidence. Autosomal STR typing
can mask the male profile when the mixed sample contains only a very small
quantity of male DNA. Performing Y-STR testing can help to identify all
males who have contributed to the evidence (Mathias et al., 1994; de Knijff et 
al., 1997; Ballantyne et al., 2010). 
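
The reasoning behind using a mixed Y-STR pattern to infer multiple male contributors can be sketched as follows. Because a male normally carries one allele at a single-copy Y-STR locus, the largest number of distinct alleles seen at any such locus gives a lower bound on the number of male contributors; for the duplicated locus DYS385 a/b each male may show up to two alleles. The function name and the mixture data below are hypothetical, and the sketch ignores allele sharing between contributors, locus duplications, stutter and drop-out, so it yields a minimum only.

```python
import math

# Loci at which a single male can show more than one allele (assumption: only
# DYS385 a/b is treated as multi-copy in this illustration).
MULTI_COPY = {"DYS385a/b": 2}


def min_male_contributors(mixture: dict, copies_per_male: dict = None) -> int:
    """Lower bound on the number of males contributing to a Y-STR mixture.

    mixture maps locus name -> list of alleles observed at that locus.
    """
    copies = MULTI_COPY if copies_per_male is None else copies_per_male
    minimum = 0
    for locus, alleles in mixture.items():
        per_male = copies.get(locus, 1)
        minimum = max(minimum, math.ceil(len(set(alleles)) / per_male))
    return minimum


# Invented mixture profile (illustration only).
evidence = {
    "DYS19": [14, 15],
    "DYS390": [23, 24, 25],
    "DYS385a/b": [11, 13, 14, 15],
}
print(min_male_contributors(evidence))  # three alleles at DYS390 imply at least 3 males
```

The bound is driven by the most informative locus, which is why typing more loci generally improves the estimate.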


In 1992, Lutz Roewer and colleagues described the first polymorphic Y- 
chromosome marker Y-27H39 - now better known as the STR locus DYS19. 
For the next ten years, discovery of polymorphic tandem repeat markers on 
the Y-chromosome progressed much more slowly than for their autosomal 
counterparts. Only 30 markers were available to researchers in 2002, but
in the last decade, 200 new STR markers have been uncovered due to
extensive research on Y-chromosome (Lahn et al., 2001). The rapid growth in 
the discovery of new Y-STR markers is a direct result of the availability of 
DNA sequence information from the Human Genome Project and improved 
bioinformatics tools for searching DNA sequence databases (Lander et al.,


2001). 




In 1997, the European forensic community settled on a core set of Y-STR 
markers or “minimal haplotype” that includes DYS19, DYS389I/II, DYS390,
DYS391, DYS392, DYS393, and DYS385 a/b with YCAII a/b as an optional
marker to create an “extended haplotype”. Most Y-chromosome data to date
has been generated with these loci. In early 2003, the U.S. Scientific Working 
Group on DNA Analysis Methods (SWGDAM) selected a core set of markers 
that includes the 9 markers in the minimal haplotype plus DYS438 and 
DYS439 (Bar et al., 1997; Beleza et al., 2003; Butler et al., 2008). 
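
The relationship between the marker panels named above can be written down as sets, which also makes it easy to check which loci a given profile lacks. This is a bookkeeping sketch only: the constant and function names are hypothetical, DYS389I/II, DYS385 a/b and YCAII a/b are each split into two entries so that the minimal haplotype contains the nine markers mentioned in the text, and the example profile is invented.

```python
MINIMAL_HAPLOTYPE = frozenset({
    "DYS19", "DYS389I", "DYS389II", "DYS390", "DYS391",
    "DYS392", "DYS393", "DYS385a", "DYS385b",
})
# Extended haplotype: minimal haplotype plus the optional YCAII a/b marker.
EXTENDED_HAPLOTYPE = MINIMAL_HAPLOTYPE | {"YCAIIa", "YCAIIb"}
# SWGDAM core set: minimal haplotype plus DYS438 and DYS439.
SWGDAM_CORE = MINIMAL_HAPLOTYPE | {"DYS438", "DYS439"}


def missing_loci(profile: dict, panel: frozenset) -> list:
    """Loci required by `panel` that have no result in `profile`."""
    return sorted(panel - set(profile))


# Invented partial profile (illustration only).
partial_profile = {"DYS19": 14, "DYS390": 24}
print(len(MINIMAL_HAPLOTYPE), len(SWGDAM_CORE))
print(missing_loci(partial_profile, SWGDAM_CORE))
```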




Figure 1.2: Position of various Y-STRs on Y-chromosome (Redd et al., 2002) 




In forensic science, the Y-chromosome is analyzed for the following purposes:

(1) Forensic casework on sexual assault evidence - male-specific
amplification can be done through Y-STR analysis, which can avoid
differential extraction to separate sperm and epithelial cells.

(2) Paternity testing - male children can be tied to fathers in motherless
paternity cases.

(3) Missing persons investigations - patrilineal male relatives may be used
for reference samples.

(4) Human migration and evolutionary studies - lack of recombination
enables comparison of male individuals separated by large periods of time.

(5) Historical and genealogical research - surnames are usually retained by
males, so links can be made where the paper trail is limited.
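
Several of the applications listed above (paternity testing and the use of patrilineal relatives as references) come down to comparing two Y-STR haplotypes locus by locus. The sketch below shows such a comparison. The haplotypes are invented, the function name is hypothetical, and the interpretation is deliberately cautious: a full match is shared by all males of the same paternal lineage and therefore does not individualize, while a difference at a single locus may reflect a mutation rather than exclusion.

```python
def compare_haplotypes(reference: dict, questioned: dict) -> tuple:
    """Compare two Y-STR haplotypes at the loci typed in both.

    Returns (shared_loci, mismatching_loci), both sorted by locus name.
    """
    shared = sorted(set(reference) & set(questioned))
    mismatches = [locus for locus in shared if reference[locus] != questioned[locus]]
    return shared, mismatches


# Invented haplotypes (illustration only).
alleged_father = {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}
male_child = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS392": 11}

shared, mismatches = compare_haplotypes(alleged_father, male_child)
print(shared)      # loci typed in both profiles
print(mismatches)  # loci at which the two profiles differ
```

Only loci typed in both samples are compared, so partial profiles from degraded material can still be evaluated, at the cost of reduced discrimination.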


1.4.2 SEPARATION AND DETECTION OF PCR PRODUCTS 

A polymerase chain reaction (PCR) in which short tandem repeat (STR)
alleles are amplified produces a mixture of DNA molecules that presents a
challenging separation problem. A multiplex PCR can produce 20 or more
different sized DNA fragments representing different alleles that must be
resolved from one another, so that each allele, including the two alleles of a
heterozygote, can be distinguished from the others. The separation is
typically performed by a size-based method known as electrophoresis. The
separation medium may be
in the form of a slab gel or a capillary (Allen et al., 1989). Capillary 
electrophoresis (CE) is a relatively new addition to the electrophoresis family. 
The first CE separations of DNA were performed in the late 1980s (Maxam & 
Gilbert, 1977; Sanger et al., 1977). Since the introduction of new CE 


instrumentation in the mid-1990s, the technique has gained rapidly in 




popularity for routine forensic analyses. While slab-gel electrophoresis has 
been a proven technique for over 40 years, there are a number of advantages 
to analyzing DNA in a capillary format. First and foremost, the injection, 
separation, and detection steps can be fully automated, permitting multiple 
samples to be run unattended by CE. In addition, only tiny quantities of 
sample are consumed in the injection process, leaving enough sample to be
easily retested if needed. This is an important advantage for precious forensic 
specimens that often cannot be easily replaced. Separation in capillaries may 
be conducted in minutes rather than hours due to higher voltages that are 
permitted with improved heat dissipation from capillaries. Another advantage 
is that CE instruments are designed such that quantitative information is 
readily available in an electronic format following the completion of a run. No 
extra steps such as scanning the gel or taking a picture of it are required 


(McCord et al., 1993; Butler et al., 1994). 


Over the years a number of methods have been used for detecting DNA 
molecules following electrophoretic separation. Early techniques involved 
radioactive labels and autoradiography. These methods were sensitive and 
effective but time consuming. In addition, the use of radioisotopes was 
expensive due to the need for photographic films and supplies and the 
extensive requirements surrounding the handling and disposal of radioactive 
materials. Since the late 1980s, methods such as silver staining and 
fluorescence techniques have gained in popularity for detecting STR alleles 
due to their low cost, in the case of silver staining, and their capability of 
automating the detection, in the case of fluorescence (Livak et al., 1995; 
Butler et al., 2001). The first capillary electrophoresis (CE) separations of 
short tandem repeat (STR) alleles were performed in late 1992 using
nondenaturing conditions with the polymerase chain reaction (PCR) products
in a double-stranded form. Fluorescent intercalating dyes were used to
visualize the DNA with laser-induced fluorescence detection and to promote 
the resolution of closely spaced alleles. Internal standards were used to 
bracket the alleles in order to perform accurate STR genotyping. An allelic 
ladder was first run with the internal standards to calibrate the DNA migration 
times followed by analysis of the samples with the same internal standards. 
Now, fluorescence-based detection assays are widely used in forensic 
laboratories due to their capabilities for multicolor analysis as well as rapid 
and easy-to-use formats. In the application to DNA typing with STR markers,
the fluorescent dye is attached to a PCR primer that is incorporated into the 
amplified target region of DNA. Amplified STR alleles are visualized as 
bands on a gel or represented by peaks on an electropherogram. A 
fluorescence detector is a photosensitive device that measures the light 
intensity emitted from a fluorophore. Detection of low-intensity light may be 
accomplished with a photomultiplier tube (PMT) or a charge-coupled device 
(CCD). The action of a photon striking the detector is converted to an electric 
signal. The strength of the resultant current is proportional to the intensity of 
the emitted light. This light intensity is typically reported in arbitrary units, 
such as relative fluorescence units (RFUs). Multi-component spectral analysis 
is performed by testing a standard set of DNA fragments labeled with each 
individual dye. Computer software provided with the CE instrument then 
analyzes the data from each of the dyes. Use of different colored fluorescent
dyes has made it possible to analyze different STR loci simultaneously, each
with its own color label (Gill et al., 2001). The computer software converts
the raw data into fragment sizes, which are then matched to the number of
repeats at each STR locus in the DNA sample.
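
The conversion from raw data to fragment size, and from fragment size to an allele call, can be illustrated with a minimal Python sketch. It assumes plain linear interpolation between the two internal-standard peaks that bracket a sample peak; instrument software generally uses more refined sizing algorithms. The function names, size-standard values, ladder sizes and the 0.5bp matching window are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

def size_from_migration(t, std_times, std_sizes):
    """Estimate fragment size (bp) by linear interpolation between the
    two internal-standard peaks that bracket migration time t."""
    if not std_times[0] <= t <= std_times[-1]:
        raise ValueError("peak lies outside the size-standard range")
    # Index of the standard peak just above t (clamped to the last peak).
    i = min(bisect_right(std_times, t), len(std_times) - 1)
    t0, t1 = std_times[i - 1], std_times[i]
    s0, s1 = std_sizes[i - 1], std_sizes[i]
    return s0 + (t - t0) * (s1 - s0) / (t1 - t0)

def call_allele(size_bp, ladder, tolerance=0.5):
    """Return the ladder allele within +/- tolerance bp of the measured
    size, or None if no ladder rung matches."""
    for allele, ladder_size in ladder.items():
        if abs(size_bp - ladder_size) <= tolerance:
            return allele
    return None

# Hypothetical internal standard: migration times (s) and known sizes (bp).
std_times = [1000.0, 1100.0, 1200.0, 1300.0]
std_sizes = [100.0, 150.0, 200.0, 250.0]
# Hypothetical allelic ladder for a tetranucleotide locus (allele -> bp).
ladder = {13: 172.0, 14: 176.0, 15: 180.0, 16: 184.0}

size = size_from_migration(1152.0, std_times, std_sizes)
allele = call_allele(size, ladder)
print(size, allele)  # 176.0 14
```

A peak that falls between ladder rungs returns None in this sketch, which is where an off-ladder allele would need manual review.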




1.5 INDIAN POPULATION DIVERSITY 


India is a country with enormous social and cultural diversity due to its 
positioning on the crossroads of many historic and pre-historic human 
migrations. The hierarchical caste system in the Hindu society dominates the 
social structure of the Indian populations (Bhasin & Walter, 2001). The origin 
of the caste system in India is a matter of debate with many linguists and 
anthropologists suggesting that it began with the arrival of Indo-European 
speakers from Central Asia about 3500 years ago. Previous genetic studies 
based on Indian populations failed to achieve a consensus in this regard 
(Thanseem et al., 2006). Indian populations are classified into various caste,
tribe and religious groups, which together make them unique compared to
the rest of the world. India is considered a treasure for geneticists and
evolutionary scholars, as it comprises 4,635 anthropologically well-defined
populations, among which 532 are tribes, including 72 primitive tribes
(36 hunters and gatherers). They differ from each other with respect to their
language, social structure, dress and food habits, marriage practices,
physical appearance and genetic architecture. India harbours a variety of
geographical realms that give refuge to diverse humans and a variety of
microbes, plants and animals. Four major language families are spoken in
India: Indo-European, Dravidian, Austro-Asiatic, and Tibeto-Burman. In
addition, India has the enigmatic Andaman-Nicobar Islanders, who have been
proposed to be descendants of an early group of modern humans (Papiha, 1996;
Thangaraj et al., 2003; Tamang & Thangaraj, 2003; Tamang et al., 2012).




Figure 1.3: Possible routes of modern human migrations to the Indian
subcontinent (Tamang et al., 2012)


1.5.1 SOCIAL STRATIFICATION AMONG HINDU SOCIETY 

The caste system has persisted in Indian Hindu society for around 3,500 years. 
Like the Y chromosome, caste is defined at birth, and males cannot change 
their caste (Zerjal et al., 2007). The caste system was a typical feature of the 
Hindu society and it divided Hindus into four categories viz. Brahmins, 
Kshatriyas, Vaishyas and Sudras. Brahmins were primarily involved in 
teaching and performing rituals, Kshatriyas were rulers and defended the 
territory, Vaishyas were businessmen and the Sudras served as the labourers. 
Further, each caste is subdivided into subcastes and subcastes into multiple 
Gotras. The caste system became the governing factor of all socio-religious
and economic activities of people. The tribes remained isolated from the other
groups and occupied relatively remote places. Several religious communities
built up in mainland India during the course of time due to several waves of
migrations from different directions. The rise of the majority of religious
groups was basically due to cultural adaptations (Majumder, 1998).


1.5.2 POPULATIONS OF ODISHA 

Odisha is located on the southeast coastal region of India. It is endowed with
nature’s bounty: a 482km stretch of coastline with virgin beaches, serpentine
rivers, mighty waterfalls, and the forest-clad blue hills of the Eastern Ghats
with rich wildlife. Odisha is also dotted with exquisite temples, historic
monuments and feats of modern engineering. With a rich heritage that is more
than two thousand years old, Odisha has a glorious history of its own. It was
known under different names in different periods: Kalinga, Utkal, Odradesha or
Orissa. Seaports flourished along the coast as early as the 4th and 5th
centuries B.C., when the sadhabs, the Odishan seafaring merchants, went to 
the islands of Java, Sumatra, Borneo and Bali with their merchandise. Not 
only did they bring home wealth and prosperity, they also carried the glorious 
Indian civilization with them and helped it spread abroad. Odisha has a
population of about 4.2 crore (Census of India, 2011). Odisha is inhabited
by various population groups belonging to different strata of the hierarchical 
caste system. The non-tribal populations of Odisha belong to four different
castes, namely Brahmins, Khandayats, Karans and Gope. Khandayats belong to
an ancient warrior group also known as Kshatriya and constitute over 30% of
the state’s population.




1.5.3 Y-CHROMOSOME BASED POPULATION STUDIES 

Y-chromosome markers (STRs and SNPs) are located in the non-recombining 
region of Y-chromosome (NRY) and can preserve paternal history. Very 
recently, there has been an increasing use of hundreds of thousands of 
autosomal SNPs to deduce population structure (Reich et al., 1991; Chaubey 
et al., 2011; Saha et al., 2003; Sahoo et al., 2006). Y-chromosome and 
mtDNA markers have been extensively used to infer peopling of different 
continents/countries and to trace the maternal and paternal lineages of 
different populations (Hammer, 1995; Bamshad et al., 2001; Kivisild et al.,
2003; Thangaraj et al., 2003; Bamshad & Wooding, 2003; Basu et al., 2003;
Thanseem et al., 2006; Thangaraj et al., 2007). The most accepted model for
human origin and migration is known as ‘out-of-Africa’, which proposes that
modern humans originated in Africa and subsequently migrated and expanded to
different continents through a southern coastal route 60,000 to 85,000
ybp (Thangaraj et al., 2007). The southern coastal route hypothesis holds
that a small group of modern humans, after crossing the Fertile Crescent,
entered India, then moved on to southeast Asia and subsequently
(50,000 to 60,000 ybp) to Australia and the rest of the world (Thangaraj et al.,
2003; Macaulay et al., 2005). Recently, the early peopling of Europe has been
dated to approximately 45,000 ybp, and many more corrections to the previous
dating have been put forward (Callaway, 2012).


DNA-based studies on Indian populations began during the early 1990s.
However, some of the initial studies dealt with populations that were neither
anthropologically well-defined nor truly representative of Indian
populations (Semino et al., 1991; Passarino et al., 1992; Soodyall & Jenkins,
1992; Barnabas et al., 1996). Mountain et al. (1995) were probably the first
to address the demographic history of India, based on sequencing of
the mitochondrial control (D-loop) region. In a study dealing with the 9 bp
deletion located in the mitochondrial genome among a number of tribal and 
caste populations of southern India, Watkins et al. (1999) suggested multiple
origins of the 9 bp deletion in southern India, indicating heterogeneity among
Indians. The traces of socio-cultural, linguistic and physiographical boundaries
and evolutionary forces leading to diversity are well documented in the recent 
studies. The most accepted and proven view on Indians is that peopling of 
India is very ancient along with recent gene flow from west and east Eurasia 
(Kivisild et al. 1999; Bamshad et al. 2001; Misra 2001; Basu et al. 2003; 
Thangaraj et al., 2003; Thangaraj et al., 2005; Underhill et al. 2010; Chaubey 
et al. 2008, 2011; Chandrasekar et al. 2009). The vast majority (> 98%) of the 
Indian maternal gene pool, consisting of Indo-European and Dravidian
speakers, is genetically more or less uniform. Invasions after the late
Pleistocene settlement might have been mostly male-mediated. However,
Y-SNP data provide compelling genetic evidence for a tribal origin of the lower
caste populations in the subcontinent. Lower caste groups might have 
originated with the hierarchical divisions that arose within the tribal groups 
with the spread of Neolithic agriculturalists, much earlier than the arrival of 
Aryan speakers. The Indo-Europeans established themselves as upper castes 
among this already developed caste-like class structure within the tribes 
(Thangaraj et al., 2006; Thangaraj et al., 2010; Sengupta et al., 2006; 
Eaaswarkhanth et al., 2010). In the last decade, several studies have also been 
carried out to study the origin of various Indian castes (Thangaraj et al., 2007; 
Frank et al., 2008; Mukherjee et al., 2009; Nair et al., 2011; Khurana et al.,
2014). 




Literature Review 


2.1 HUMAN GENOME PROJECT 


DNA (deoxyribonucleic acid) was first isolated by
Friedrich Miescher (1869). Its composition and biochemical nature were
described by Levene (1919). The double helical structure of DNA was
discovered by Watson, Crick and Wilkins (1953). The 50th anniversary of
the discovery of DNA structure was marked by the successful completion of the
Human Genome Project (Lander et al., 2001; Venter et al., 2003). The Human
Genome Project was a 13-year-long, publicly funded project initiated in 1990 
with the objective of determining the DNA sequence of the entire human 
genome. In its early days, the Human Genome Project was met with 
skepticism by many people, including scientists and nonscientists alike. One 
prominent question was whether the huge cost of the project would outweigh 
the potential benefits. Today, however, the overwhelming success of the 
Human Genome Project is readily apparent. Not only did the completion of 
this project usher in a new era in genomics, but it also led to significant 
advances in the types of technology used to sequence DNA (Waterson et al.,
2003). Humans are identical over most of their genomes. Thus, only a
relatively small number of genetic differences have resulted in the striking
variation seen among individuals of our species. This phenotypic variation
among humans was the subject of a study by Luis B. Barreiro and his
colleagues at the Pasteur Institute in Paris (Barreiro et al., 2008). In particular,
Barreiro and his colleagues were interested in how natural selection has led to
phenotypic differences.


When we think of variation between people, we often think of differences in
height, weight, and skin color. Each of these characteristics is only partially
controlled by genes. The complex interaction between genes and the
environment, as well as between multiple genes, makes trying to understand
and quantify human phenotypic variation difficult. Therefore, instead of 
looking at complex human traits, Barreiro and his colleagues went straight to 
the source and looked for nucleotide sequences in the genome that could tell 
them about individual human variation. For this study, the identification of 
single base changes (single nucleotide polymorphisms, or SNPs) was 
considered ideal. Barreiro and his colleagues obtained data for their research 
from the HapMap project, an international consortium that has built a vast and 
growing repository of human genetic variation. To date, the project has 
analyzed over 3.1 million SNPs across the human genome common to 270 
individuals of African, Asian, and European ancestry (International HapMap
Consortium, 2003, 2005).


A SNP is a variation of a single nucleotide between individuals. These 
polymorphisms can therefore be used to discern small differences both within 
a population and among different populations (Ramana et al., 2001). The 
beauty of SNPs is that the observed variation can be followed over time and 
quantified. If SNPs change either the function of a gene or its expression, and 
the change provides greater fitness for a population (i.e., a higher capacity to 
survive and/or reproduce in a given environment), the change will be favored 
by natural selection. Therefore, SNPs can be the basis of evolutionary change.
This was the basic premise of Barreiro's study.


Simple tandem-repetitive regions of DNA (or ‘minisatellites’) which are 
dispersed in the human genome frequently show substantial length 
polymorphism arising from unequal exchanges which alter the number of 
short tandem repeats in a minisatellite. The repeat elements in a subset of
human minisatellites share a common 10-15 base-pair (bp) ‘core’ sequence
which might act as a recombination signal in the generation of these
hypervariable regions (Ludwig et al., 1989). A hybridization probe consisting 
of the core repeated in tandem can detect many highly polymorphic 
minisatellites simultaneously to provide a set of genetic markers of general 
use in human linkage analysis. Other variant probes can detect additional sets 
of hypervariable minisatellites to produce somatically stable DNA 
‘fingerprints’ which are completely specific to an individual (or to his or her 
identical twin) and can be applied directly to problems of human 
identification, including parenthood testing (Jeffreys et al., 1985; Jeffreys et 
al., 1986; Jeffreys et al., 1988). 


2.2 REPEATED DNA SEQUENCES & GENETIC MARKERS 


Since it has been estimated that over 99.7% of the human genome is the same 
from individual to individual, regions that differ need to be found in the 
remaining 0.3% in order to tell people apart at the genetic level. There are 
many repeated DNA sequences scattered throughout the human genome. As 
these repeat sequences are typically located between genes, they can vary in 
size from person to person without impacting the genetic health of the
individual (Venter et al., 2003; Jobling, 2012).


Human genomes are full of repeated DNA sequences (Ellegren, 2004). These 
repeated DNA sequences come in all sizes and are typically designated by the 
length of the core repeat unit and the number of contiguous repeat units or the 
overall length of the repeat region. Long repeat units may contain several 
hundred to several thousand bases in the core repeat. These regions are often 
referred to as satellite DNA and may be found surrounding the chromosomal
centromere. The term satellite arose due to the fact that frequently one or more
minor “satellite bands” were seen in early experiments involving equilibrium
density gradient centrifugation (Britten & Kohne, 1968; Primrose, 1998). The 
core repeat unit for a medium-length repeat, sometimes referred to as a 
minisatellite or a VNTR (variable number of tandem repeats), is in the range 
of approximately 8 base pairs (bp) to 100bp in length (Nakamura et al., 1987; 
Boerwinkle et al., 1989; Odelberg et al., 1989; Tautz, 1993). The most 
commonly used minisatellite marker in the 1990s was D1S80, which has a 
16bp repeat unit and contains alleles spanning the range of 14 to 41 repeat 
units (Kasai et al., 1990; Budowle et al., 1991; Butler, 2010). DNA regions
with repeat units that are 2bp to 7bp in length are called microsatellites, 
simple sequence repeats (SSRs), or most usually short tandem repeats (STRs). 
STRs have become popular DNA repeat markers because they are easily 
amplified by the polymerase chain reaction (PCR) without the problems of 
differential amplification (Horn et al., 1989; Kimpton et al., 1993; Kimpton et 
al., 1994; Kimpton et al., 1996). This is because both alleles from a 
heterozygous individual are similar in size since the repeat size is small. The 
number of repeats in STR markers can be highly variable among individuals, 
which makes these STRs effective for human identification purposes (Litt & 
Lutty, 1989). Literally thousands of polymorphic microsatellites have been 
characterized in human DNA and there may be more than a million 
microsatellite loci present depending on how they are counted (Ellegren, 
2004). Regardless, microsatellites account for approximately 3% of the total 
human genome (International Human Genome Sequencing Consortium, 
2001). STR markers are scattered throughout the genome and occur on 
average every 10,000 nucleotides (Edwards et al., 1991). However, not all 
STR loci exhibit variability between individuals. Computer searches of the 
recently available human genome reference sequence have cataloged the
number and nature of STR markers in the genome (Gill, 2002). A large
number of STR markers have been characterized by academic and commercial
laboratories for use in disease gene location studies (Broman et al., 1998; 
Ghebranious et al., 2003). To perform analysis on STR markers, the invariant 
flanking regions surrounding the repeats must be determined. Once the 
flanking sequences are known then PCR primers can be designed and the 
repeat region amplified for analysis. New STR markers are usually identified 
in one of two ways: (1) searching DNA sequence databases such as GenBank 
for regions with more than six or so contiguous repeat units (Weber & May, 
1989; Collins et al., 2003; Subramanian et al., 2003); or (2) performing
molecular biology isolation methods (Edwards et al., 1991; Chambers & 
MacAvoy, 2000). 
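
The first approach, scanning sequence data for runs of contiguous repeat units, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_strs` is hypothetical, the threshold of six repeats is taken from the "six or so" figure above, and a real database search would work on GenBank records rather than a short string.

```python
import re

def find_strs(sequence, min_unit=2, max_unit=7, min_repeats=6):
    """Return (start, repeat_unit, repeat_count) for every perfect tandem
    run of at least min_repeats units, for unit lengths of 2bp to 7bp."""
    sequence = sequence.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One captured unit followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # e.g. "ATAT" is reported as the dinucleotide "AT" instead.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits

# Hypothetical test sequence: eight GATA repeats between short flanks.
example = "TTCC" + "GATA" * 8 + "CCTT"
print(find_strs(example))  # [(4, 'GATA', 8)]
```

The flanking sequence on either side of each hit is what would then be used for primer design.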


2.2.1 FORENSIC DNA TYPING 

For human identification purposes it is important to have DNA markers that 
exhibit the highest possible variation or a number of less polymorphic markers 
that can be combined in order to obtain the ability to discriminate between 
samples. Forensic specimens are often challenging to PCR amplify because 
the DNA in the samples may be severely degraded. Mixtures are prevalent as 
well in some forensic samples, such as those obtained from sexual assault 
cases containing biological material from both the perpetrator and victim 
(Griffiths et al., 1998; Gonzalez et al., 2001; Hanson & Ballantyne, 2007). 
The small size of STR alleles (roughly 100bp to 400bp, built from 2bp to 7bp
repeat units) compared to minisatellite VNTR alleles (400bp to 1000bp) makes
the STR markers better candidates for use in forensic applications where
degraded DNA is common (Bar et al., 1997). PCR amplification of degraded DNA
samples can be better accomplished with smaller product sizes. These
reduced-size STR amplicons are often referred to as miniSTRs. Allelic dropout
of larger alleles, caused by preferential amplification of the smaller allele,
is also a significant problem with minisatellites. Furthermore, single-base
resolution of DNA fragments can be obtained more easily with sizes below 500bp
using high-resolution capillary electrophoresis. Thus, for both biological and
technological reasons the smaller STRs are advantageous compared to the larger
minisatellites (VNTRs).
Among the various types of STR systems, tetranucleotide repeats have 
become more popular than di- or trinucleotides. Penta- and hexanucleotide 
repeats are less common in the human genome but are being examined by 
some laboratories (Hammond et al., 1994). A biological phenomenon known 
as “stutter” results when STR alleles are PCR amplified. Stutter products are 
amplicons that are typically one or more repeat units less in size than the true 
allele and arise during PCR because of strand slippage (Walsh et al., 1996). 
Stutter product amounts vary depending on the STR locus and even the length 
of the allele within the locus but are usually less than 15% of the allele 
product quantity with tetranucleotide repeats. With di- and trinucleotides, the 
stutter percentage can be much greater (30% or more) making it difficult to 
interpret sample mixtures. In addition, the four-base spread in alleles with 
tetranucleotides makes closely spaced heterozygotes easier to resolve with 
size-based electrophoretic separations compared to alleles that could be two or 
three bases different in size with dinucleotide and trinucleotide markers, 
respectively (Kirby, 1992; Pascali et al., 1998; Grignani et al., 2000; Bieber et
al., 2006). 
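
The stutter relationship described above can be illustrated with a small Python sketch that flags a peak as probable stutter when it sits one repeat unit below a taller peak and does not exceed a set fraction of that peak's height. The 15% threshold is taken from the figure quoted for tetranucleotide repeats; the function name and peak heights are hypothetical, and a laboratory would normally apply locus-specific thresholds.

```python
def flag_stutter(peaks, stutter_threshold=0.15):
    """peaks maps allele (repeat number) -> peak height in RFU.
    Returns the alleles that look like stutter: one repeat shorter than
    another peak and no taller than stutter_threshold times its height."""
    flagged = set()
    for allele, height in peaks.items():
        parent_height = peaks.get(allele + 1)
        if parent_height is not None and height <= stutter_threshold * parent_height:
            flagged.add(allele)
    return flagged

# Hypothetical tetranucleotide locus: a small peak at 13 next to a true 14.
peaks = {13: 120, 14: 1500, 16: 1400}
print(flag_stutter(peaks))  # {13}
```

In a mixture, a minor contributor's true allele can fall in the same position as stutter, which is why a fixed filter like this one cannot replace interpretation.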


2.2.2 TYPES OF STR MARKERS 

STR repeat sequences are named by the length of the repeat unit. Dinucleotide 
repeats have two nucleotides repeated next to each other. Trinucleotides have 
three nucleotides in the repeat unit, tetranucleotides have four, 
pentanucleotides have five, and hexanucleotides have six nucleotides in the
core repeat. However, because microsatellites are tandemly repeated, some
motifs are actually equivalent to others. STRs are often divided into several
categories based on the repeat pattern. Simple repeats contain units of 
identical length and sequence, compound repeats comprise two or more 
adjacent simple repeats, and complex repeats may contain several repeat 
blocks of variable unit length as well as variable intervening sequences 
(Urquhart et al., 1994). Complex hypervariable repeats also exist with 
numerous non-consensus alleles that differ in both size and sequence and are 
therefore challenging to genotype reproducibly (Urquhart et al., 1993; Gill et
al., 1994). This last category of STR markers is not as commonly used in
forensic DNA typing due to difficulties with allele nomenclature and
measurement variability between various laboratories, although several
commercial kits now include the complex hypervariable STR locus SE33,
sometimes called ACTBP2 (Urquhart et al., 1993).
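
The remark that some repeat motifs are equivalent to others can be made concrete with a short Python sketch. It assumes two sources of equivalence: the reading frame in which a tandem array is read (GATA, ATAG, TAGA and AGAT describe the same array) and the strand on which it is read (the reverse complement). The function name and the choice of the alphabetically first form as the canonical one are illustrative conventions only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical_motif(motif):
    """Return one representative for all motifs that describe the same
    tandem repeat: every rotation of the motif and of its reverse
    complement is generated, and the alphabetically first is returned."""
    motif = motif.upper()
    reverse_complement = motif.translate(_COMPLEMENT)[::-1]
    candidates = []
    for strand in (motif, reverse_complement):
        for i in range(len(strand)):
            candidates.append(strand[i:] + strand[:i])
    return min(candidates)

# GATA and TCTA are the same repeat read in a different frame and strand.
print(canonical_motif("GATA"), canonical_motif("TCTA"))  # AGAT AGAT
```

Grouping motifs this way reduces the twelve possible dinucleotide repeats to four distinct classes.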


2.2.3 Y-CHROMOSOME MARKERS 

The Y-chromosome and mitochondrial DNA (mtDNA) markers are known as 
“lineage markers.” They are passed down from generation to generation 
without changing (except for mutational events). Maternal lineages can be 
traced with mitochondrial DNA sequence information while paternal lineages 
can be followed with Y-chromosome markers (Graves, 1995; Graves et al., 
1998; Bower, 2000, 2003; Brown, 2002). With lineage markers, the genetic 
information from each marker is referred to as a haplotype rather than a 
genotype because there is usually only a single allele per individual. Because 
Y-chromosome markers are linked on the same chromosome and are not 
shuffled with each generation, the statistical calculations for a random match 
probability cannot involve the product rule. Therefore, haplotypes obtained 
from lineage markers can never be as effective in differentiating between two
individuals as genotypes from autosomal markers that are unlinked and
segregate separately from generation to generation (Jegalian & Lahn, 2001).
However, Y-chromosome, mitochondrial DNA, and X-chromosome markers 
can play an important role in forensic investigations as well as other human 
identification applications (Lahn et al., 2001; Gill et al., 2001).
Y-chromosome analysis is also being utilized in anthropological investigation
(human migration studies), and it can lead to accurate estimation of the time
to the most recent common ancestor (TMRCA) (Poznik et al., 2013).
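
The reason the product rule does not apply to linked Y-chromosome markers can be shown with a small Python sketch. It assumes the counting method, in which the whole haplotype is treated as a single allele and its frequency is the fraction of matching haplotypes in a reference database. The database below is hypothetical and deliberately tiny, and the function names are illustrative only.

```python
from collections import Counter

def haplotype_frequency(database, query):
    """Counting-method estimate: the fraction of database haplotypes that
    match the query at every locus."""
    counts = Counter(tuple(sorted(h.items())) for h in database)
    return counts[tuple(sorted(query.items()))] / len(database)

def product_rule_estimate(database, query):
    """Product of single-locus allele frequencies. Shown only for
    contrast: it wrongly treats linked Y-STR loci as independent."""
    n = len(database)
    estimate = 1.0
    for locus, allele in query.items():
        estimate *= sum(1 for h in database if h[locus] == allele) / n
    return estimate

# Hypothetical database of four haplotypes typed at two loci.
database = [
    {"DYS19": 14, "DYS390": 24},
    {"DYS19": 14, "DYS390": 24},
    {"DYS19": 15, "DYS390": 23},
    {"DYS19": 15, "DYS390": 23},
]
query = {"DYS19": 14, "DYS390": 24}

print(haplotype_frequency(database, query))    # 0.5
print(product_rule_estimate(database, query))  # 0.25
```

Because the two loci are inherited together in this toy database, multiplying their allele frequencies makes the haplotype look rarer than it is observed to be.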


A detailed analysis of the “finished” reference Y-chromosome sequence was 
described in the June 19, 2003 issue of Nature by researchers from the 
Whitehead Institute and Washington University. Although it is described as
a “finished” sequence, Skaletsky et al. (2003) report on only 23Mb of the
roughly 50Mb present in a typical human Y-chromosome. The unreported
and as yet unsequenced 30Mb portion is a heterochromatin region located on
the long arm of the Y-chromosome that is not transcribed and is composed of
highly repetitive sequences, which are impossible to sequence reliably with
current technology. At 50Mb, the Y-chromosome is the third smallest human
chromosome, only slightly larger than chromosome 21 (47Mb) and
chromosome 22 (49Mb). The tips of the Y-chromosome, which are called the
pseudo-autosomal regions (PAR), recombine with the homologous regions of the
X-chromosome. PAR1, located at the tip of the short arm
(Yp) of the Y-chromosome, is approximately 2.5Mb in length, while PAR2, at
the tip of the long arm (Yq), is less than 1Mb in size (Graves et al., 1998). The
remainder of the Y-chromosome (95%) is known as the non-recombining
portion of the Y-chromosome, or NRY. The NRY remains the same from
father to son unless a mutation occurs. Some authors term the NRY the
male-specific region (MSY) because of evidence of frequent gene conversion or
intra-chromosomal recombination (Skaletsky et al., 2003). A total of 156
known transcription units including 78 protein-coding genes are present on
MSY. Many sequences in the Y-chromosome are highly duplicated either 
with themselves or with the X-chromosome. Three classes of sequences have 
been characterized in the Y-chromosome: X-transposed, X-degenerate, and 
ampliconic (Skaletsky et al., 2003). Two blocks on the short arm of the
Y-chromosome with a combined length of 3.4Mb make up the X-transposed
sequences. These sequences are 99% identical to sequences found in Xq21,
contain two coding genes, and do not participate in X-Y crossing over during
male meiosis. X-degenerate segments of MSY occur in eight blocks on both 
the short arm and the long arm of the Y-chromosome with an aggregate length 
of 8.6Mb. These X-degenerate segments possess up to 96% nucleotide 
sequence identity to their X-linked homologues. These X-homologous regions 
can make it challenging to design Y-chromosome assays that generate
male-specific DNA results. If portions of an X-homologous region of the
Y-chromosome are examined inadvertently, then female DNA, which possesses
two X-chromosomes, will be detected. Thus, when testing
Y-chromosome-specific assays it is important to examine them in the presence
of high levels of female DNA to verify that there is little-to-no cross talk with
X-homologous regions of the Y-chromosome (Butler et al., 2002; Hall & Ballantyne, 2003).
The ampliconic segments are composed of seven large blocks scattered across 
both the short arm and the long arm and covering about 10.2Mb of the
Y-chromosome (Skaletsky et al., 2003). Some 60% of these ampliconic
sequences have intrachromosomal identities of 99.9% or greater. In other
words, it is very difficult to tell these sequences apart from one another.
Another interesting feature of these ampliconic segments is that many of them
are palindromes; that is, the almost exact duplicate sequences are inverted with
respect to each other, essentially as mirror images. Eight large
palindromes collectively comprise 5.7Mb of Yq, with at least six of these
palindromes containing testis genes. Genetic markers within these
palindromic regions will exist as multi-copy PCR products from single primer
sets. For example, the DAZ (deleted in azoospermia) gene occurs in four copies
at 24Mb along the reference sequence (Saxena et al., 1996; Skaletsky et al.,
2003; Prinz, 2003; Melissa et al., 2014).


2.2.3.1 Minimal Haplotype Loci 

The number of Y-STR loci available for use in human identity testing has 
increased dramatically since the turn of the century and the availability of the 
human genome sequence. In the 1990s only a handful of Y-STR markers were
characterized and available for use, and only about 30 Y-STRs were available
to researchers at the beginning of 2002 (Butler, 2003). These Y-STRs have
been cataloged and mapped to their Y-chromosome positions (Hanson &
Ballantyne, 2006). Yet even with a limited number of loci available at the
time, a core set was selected in 1997 that continues to serve as the “minimal
haplotype” loci (Kayser et al., 1997; Pascali et al., 1998). The minimal
haplotype is defined by the single copy Y-STR loci DYS19, DYS389I, 
DYS389II, DYS390, DYS391, DYS392, DYS393, and the highly 
polymorphic multi-copy locus DYS385 a/b (Schneider et al., 1998). By means 
of a multicenter study, more than 4000 male DNA samples from 48 different 
subpopulation groups were studied with the single copy loci in the minimal 
haplotype set (de Knijff et al., 1997). This work formed the basis for what is 
now the online Y-STR Haplotype Reference Database (http://www.yhrd.org)
that will be described in more detail below.


In January 2003, the U.S. Scientific Working Group on DNA Analysis 
Methods (SWGDAM) recommended use of the minimal haplotype loci plus 
two additional single copy Y-STRs: DYS438 and DYS439 (Ayub et al.,
2000). Information regarding these core loci and other loci present in
commercial Y-STR kits may be found in Table. Although other Y-STRs may 
be added to databases as their value is demonstrated and they become part of 
commercially available kits, the original minimal haplotype loci and 
SWGDAM-recommended Y-STRs are likely to dominate human identity
applications in the coming years.
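
The two core sets named above can be written down as simple data, which makes it easy to check whether a typed profile covers them. The following Python sketch is illustrative only: the locus names come from the text, while the profile, the key used for the multi-copy locus and the function name are hypothetical.

```python
# Minimal haplotype loci as listed in the text; the multi-copy locus
# DYS385 a/b is stored under a single key here (an assumed convention).
MINIMAL_HAPLOTYPE = (
    "DYS19", "DYS389I", "DYS389II", "DYS390",
    "DYS391", "DYS392", "DYS393", "DYS385a/b",
)
# SWGDAM recommendation: the minimal haplotype plus two single copy loci.
SWGDAM_CORE = MINIMAL_HAPLOTYPE + ("DYS438", "DYS439")

def missing_loci(profile, required=SWGDAM_CORE):
    """Return the required loci that are absent from a typed profile."""
    return [locus for locus in required if locus not in profile]

# Hypothetical profile typed only at the minimal haplotype loci.
profile = {
    "DYS19": 15, "DYS389I": 13, "DYS389II": 30, "DYS390": 24,
    "DYS391": 10, "DYS392": 11, "DYS393": 13, "DYS385a/b": (11, 14),
}
print(missing_loci(profile))  # ['DYS438', 'DYS439']
```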


2.2.3.2 Y-STR Nomenclature 

The DNA Commission of the International Society of Forensic Genetics 
(ISFG) has made a series of recommendations on the use of Y-STR markers 
(Carvalho-Silva et al., 1999; Gill et al., 2001; Gusmao et al., 2006). Their
recommendations address allele nomenclature, use of allelic ladders, 
population genetics, and reporting methods. The ISFG recommendations for 
Y-STR allelic ladders include the following: (a) the alleles should span the 
distance of known allelic variants for a particular locus, (b) the rungs of the 
ladder should be one repeat unit apart wherever possible, (c) the alleles 
present in the ladder should be sequenced, and (d) the ladders should be 
widely available to enable reliable interlaboratory comparisons. The existence 
of commercially available Y-STR kits has now facilitated the widespread use 
of consistent allelic ladders. Prior to commercially available Y-STR kits and 
consistent allelic ladders, various researchers in the field took different 
approaches to naming alleles. For some loci there were instances of multiple 
published designations for the same allele. An example of this phenomenon 
that illustrates the importance of standardization is DYS439, which has been 
designated three different ways in the literature. In an effort to provide a 
unified nomenclature for STR loci, a comparative analysis of the repeat and 
sequence structure of Y-chromosome markers in humans and chimpanzees
has been proposed and 11 human Y-STRs have been studied (Gusmao et al.,
2002). Since the chimpanzees examined in their study did not vary in the
regions outside of the variable core GATA repeat for DYS439, Gusmao et al.
(2002) proposed a [GATA]n repeat structure for humans. This nomenclature
has now been adopted for all commercial STR kits typing DYS439.
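
The repeat-based designation described above can be illustrated with a short sketch. It assumes that an allele is named after the number of uninterrupted copies of the core motif (GATA for DYS439). The helper name longest_repeat_run and the flanking sequence are hypothetical and chosen only for illustration; the toy sequence is not a real DYS439 amplicon.

```python
def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    sequence = sequence.upper()
    motif = motif.upper()
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Extend the run while the motif keeps repeating without interruption.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Toy sequence (an assumption, not real data): invented flanks around 12 GATA copies.
toy_allele = "CCTAGT" + "GATA" * 12 + "GAAAGT"
allele_name = longest_repeat_run(toy_allele, "GATA")
print(f"Toy allele designation under a [GATA]n nomenclature: {allele_name}")
```

Real allele calling relies on sizing against a sequenced allelic ladder rather than on string matching, so the sketch only shows why a single agreed repeat structure yields a single allele name.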


2.2.3.3 Y-STR Kits 

As noted in Chapter 5, forensic scientists rely heavily on commercially 
available kits to perform DNA testing. Thus, many laboratories, especially in
the U.S., were reluctant to move into Y-STR typing until Y-STR kits were
offered. The two most widely used Y-STR kits are PowerPlex Y (Promega
Corporation) and Yfiler (Applied Biosystems). All of the European and U.S. 
core Y-STR loci are included in both kits. PowerPlex Y contains one 
additional locus (DYS437) and Yfiler has six additional loci (DYS437, 
DYS448, DYS456, DYS458, DYS635, and GATA-H4). Until 2005, 
ReliaGene Technologies (formerly of New Orleans, LA) sold the Y-PLEX 12 
kit, which amplified the SWGDAM recommended loci plus the amelogenin 
marker. ReliaGene had also supplied Y-PLEX 6 and Y-PLEX 5 kits (Sinha et
al., 2004), which were precursors to the Y-PLEX 12 kit. Inclusion of 
amelogenin enables confirmation that the PCR reaction has not failed on 
female DNA samples since a single X amplicon will result. In addition, 
mixture levels of male and female DNA can be confirmed in many situations 
with the amelogenin X and Y peak height ratios. While the amelogenin 
primers provide a measure of quality control on PCR amplifications, they 
have the disadvantage of possibly tying up and consuming PCR reagents
when high levels of female DNA are present in a mixture.

Table 2.1: Characteristics of Commonly Used Y-STR Loci (Butler, 2006;
Decker et al., 2007).


STR Marker     Position (Mb)   Repeat Motif   Allele Range   Mutation Rate
DYS393          3.19           AGAT           8-17           0.10%
DYS456          4.33           AGAT           13-18          0.42%
DYS458          7.93           GAAA           14-20          0.64%
DYS19          10.13           TAGA           10-19          0.23%
DYS391         12.61           TCTA           6-14           0.26%
DYS635         12.89           TSTA           17-27          0.35%
DYS437         12.98           TCTR           13-17          0.12%
DYS439         13.03           AGAT           8-15           0.52%
DYS389 I/II    13.12           TCTR           9-17/24-34     0.25%/0.36%
DYS438         13.38           TTTTC          6-14           0.03%
DYS390         15.78           TCTR           17-28          0.21%
Y-GATA-H4      17.25           TAGA           8-13           0.24%
DYS385 a/b     19.26           GAAA           7-28           0.21%
DYS392         21.04           TAT            6-20           0.04%
DYS448         22.78           AGAGAT         17-24          0.16%
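
As a rough illustration of what the per-locus mutation rates in Table 2.1 imply for a whole haplotype, the sketch below estimates the chance that a father and son differ at one or more of the listed loci. It assumes that loci mutate independently, that the two values given for DYS389 refer to DYS389I and DYS389II, and that the DYS385 a/b entry can be treated as a single rate; these are simplifying assumptions of the sketch and are not stated in the table.

```python
import math

# Per-generation mutation rates transcribed from Table 2.1 (percent converted to fractions).
# Assumption: the two DYS389 values are for DYS389I and DYS389II respectively.
MUTATION_RATES = {
    "DYS393": 0.0010, "DYS456": 0.0042, "DYS458": 0.0064, "DYS19": 0.0023,
    "DYS391": 0.0026, "DYS635": 0.0035, "DYS437": 0.0012, "DYS439": 0.0052,
    "DYS389I": 0.0025, "DYS389II": 0.0036, "DYS438": 0.0003, "DYS390": 0.0021,
    "Y-GATA-H4": 0.0024, "DYS385a/b": 0.0021, "DYS392": 0.0004, "DYS448": 0.0016,
}


def prob_any_mutation(rates):
    """Probability that one or more loci mutate in a single father-son transmission,
    assuming independent loci (an assumption of this sketch)."""
    p_no_change = math.prod(1.0 - mu for mu in rates)
    return 1.0 - p_no_change


p_differ = prob_any_mutation(MUTATION_RATES.values())
print(f"P(one or more mutations per transmission) = {p_differ:.4f}")
```

Under these assumptions the result is on the order of a few percent per generation, which is why a single-locus difference between close paternal relatives is not automatically treated as an exclusion.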


Y-chromosome DNA testing is important for a number of different 
applications of human genetics including forensic evidence examination, 
paternity testing, historical investigations, studying human migration patterns
throughout history, and genealogical research. In terms of forensic
applications, there are both advantages and limitations to Y-chromosome 
testing. The primary value of the Y-chromosome in forensic DNA testing is 
that it is found only in males. The SRY (sex-determining region of the Y) 
gene determines maleness. Since a vast majority of crimes where DNA 
evidence is helpful, particularly sexual assaults, involve males as the 
perpetrators, DNA tests designed to only examine the male portion can be 
valuable. With Y-chromosome tests, interpretable results can be obtained in 
some cases where autosomal tests are limited by the evidence, such as high 
levels of female DNA in the presence of minor amounts of male DNA. These 
situations include sexual assault evidence from azoospermic or vasectomized
males and blood-blood or saliva-blood mixtures where the absence of sperm
prevents a successful differential extraction for isolation of male DNA (Prinz
& Sansone, 2001). In addition, the number of individuals involved in a “gang
rape” may be easier to decipher with Y-chromosome results than with highly
complicated autosomal STR mixtures. Using Y-chromosome-specific PCR
primers can improve the chances of detecting low levels of the perpetrator’s
DNA in a high background of a female victim’s DNA (Hall & Ballantyne,
2003). Y-chromosome tests have also been used to verify amelogenin
Y-deficient males (Thangaraj et al., 2002).


The same feature of the Y-chromosome that gives it an advantage in forensic 
testing, namely maleness, is also its biggest limitation. A majority of the Y- 
chromosome is transferred directly from father to son without recombination 
to shuffle its genes and provide greater genetic variety to future generations. 
Random mutation is the only mechanism for variation over time between
paternally related males. Thus, while exclusions in Y-chromosome DNA
testing results can aid forensic investigations, a match between a suspect and
evidence only means that the individual in question could have contributed the
forensic stain, as could a brother, father, son, uncle, paternal cousin, or even a
distant cousin from his paternal lineage. Inclusions with Y-chromosome
testing are therefore not as meaningful as autosomal STR matches from a
random match probability point of view (de Knijff, 2003). On the other hand,
the presence of relatives having the same Y-chromosome expands the number 
of possible reference samples in missing persons’ investigations and mass 
disaster victim identification efforts. Y-chromosome testing also aids familial 
searching (Dettlaff-Kakol & Pawlowski, 2002; Sims et al., 2008). Deficient
paternity tests, in which the father is dead or unavailable for testing, benefit
when Y-chromosome markers are used (Santos et al., 1993). However, an
autosomal DNA test is always preferred when possible since it provides a
higher power of discrimination. The Y-chromosome has also become a
popular tool for tracing historical human migration patterns through male
lineages (Jobling & Tyler-Smith, 1995, 2003). Anthropological, historical,
and genealogical questions can be answered through Y-chromosome results.
For example, Y-chromosome results in 1998 linked modern-day descendants
of Thomas Jefferson and Eston Hemings, leading to the controversial
conclusion that Jefferson fathered the slave (Foster et al., 1998).


2.2.3.4 Y-STR Haplotype Databases 

A number of online Y-STR databases exist. The forensic databases contain 
collections of anonymous individuals and can be used to estimate the 
frequency of specified Y-STR haplotypes. The genetic genealogy databases, 
such as Y-search and Y-base, contain Y-STR haplotype information gathered 
by genetic genealogy companies with different sets of loci from males trying 
to make genealogical connections. Thus, the haplotypes in these genealogy
databases are associated with specific individuals and family names.
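
How a forensic database supports a haplotype frequency estimate can be sketched with simple counting: the number of matching entries is divided by the database size. The database below is invented for illustration, and the function name count_estimate is hypothetical; production databases may apply further statistical corrections that are not shown here.

```python
from collections import Counter

# Minimal haplotype loci in a fixed order (DYS385 a/b is written as one "a-b" entry).
MINIMAL_LOCI = ("DYS19", "DYS389I", "DYS389II", "DYS390", "DYS391",
                "DYS392", "DYS393", "DYS385")

# Toy reference database (invented values, for illustration only).
toy_database = [
    ("14", "13", "29", "24", "11", "13", "13", "11-14"),
    ("14", "13", "29", "24", "11", "13", "13", "11-14"),
    ("15", "13", "30", "23", "10", "11", "12", "13-17"),
    ("16", "12", "28", "25", "10", "11", "13", "11-15"),
]


def count_estimate(haplotype, database):
    """Return (matches, database size, matches / size) for a queried haplotype."""
    counts = Counter(database)
    matches = counts[tuple(haplotype)]
    return matches, len(database), matches / len(database)


query = ("14", "13", "29", "24", "11", "13", "13", "11-14")
matches, size, frequency = count_estimate(query, toy_database)
print(f"{matches} match(es) in {size} haplotypes, frequency = {frequency:.2f}")
```

Because every locus sits on the same non-recombining chromosome, the whole haplotype is counted as one unit; per-locus allele frequencies are not multiplied together as they would be for autosomal STRs.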




YHRD 

The largest and most widely used forensic and general population genetics Y- 
STR database, known as the Y-STR Haplotype Reference Database (YHRD), 
was created by Lutz Roewer and colleagues at Humboldt University in Berlin,
Germany, and has been available online since 2000 (Roewer, 2003; Willuweit
& Roewer, 2007). As of 2014, YHRD contains results from more than
189,000 samples with minimal haplotype loci results representing 710 different
groups of sample submissions from various populations and countries around
the world. Searches on YHRD may be conducted by population group or
geographic location.




2.3 FORENSIC & POPULATION GENETICS STUDIES 
Y-STR polymorphisms were first discovered in the 1980s (Tautz, 1989;
Malaspina et al., 1990). Some researchers reported that STR
polymorphisms appear to occur less frequently on the Y chromosome than
on autosomes (Spurdle and Jenkins, 1992).

Roewer et al. (1992) described a tetrameric simple repeat polymorphism
mapped to Yp (DYS19). Three polymorphic dimeric Y-STR loci (YCAI, YCAII,
YCAIII) were described by Mathias et al. (1994). These STRs show
moderate levels of polymorphism and are used for routine forensic as well as
anthropological applications (Roewer & Epplen, 1992; Roewer et al.,
1993; Gomolka et al., 1994; Mathias et al., 1992).


Roewer et al. (1996) utilized Y-chromosomal STR polymorphisms for male
identification.


A multicenter study was carried out to characterize 13 polymorphic short
tandem repeat (STR) systems located on the male-specific part of the human
Y chromosome (DYS19, DYS288, DYS385, DYS388, DYS389I/II, DYS390,
DYS391, DYS392, DYS393, YCAI, YCAII, YCAIII, DXYS156Y) (Kayser et
al., 1997). Amplification parameters and electrophoresis protocols, including
multiplex approaches, were compiled. The typing of non-recombining Y loci
with uniparental inheritance requires special attention to population
substructuring due to prevalent male lineages. To assess the extent of these
subheterogeneities, up to 3825 unrelated males were typed in up to 48
population samples for the respective loci. A consistent repeat-based
nomenclature for most of the loci was introduced. The authors estimated the
average mutation rate for DYS19 in 626 confirmed father-son pairs as
3.2 x 10^-3 (95% confidence interval limits of 0.00041-0.00677), a value
which can also be expected for other Y-STR loci with similar repeat structure.
Recommendations are given for the forensic application of a basic set of 7
STRs (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, and
DYS393) for standard Y-haplotyping in forensic and paternity casework.
They further recommend the inclusion of the highly polymorphic bilocal
Y-STRs DYS385, YCAI, YCAIII for a nearly complete individualization of
almost any given unrelated male individual.


To facilitate evolutionary and forensic studies of DNA polymorphisms on the 
Y chromosome, a multiplex DNA typing technique was devised for four
tetranucleotide STR loci (DYS19, DYS390, DYS391, and DYS393) (Redd et 
al., 1997). These Y-STR loci were simultaneously amplified with FAM- 
labeled primers and genotypes were determined with an automated DNA 
sequencer. They typed 162 males from three U.S. populations (African- 
Americans, European-Americans and Hispanics) and found that the haplotype 
diversities ranged from 0.920 to 0.969. This quadruplex system provides a
facile means of genotyping these Y chromosome STRs, and should be useful
in population genetic and forensic applications.
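
Haplotype diversity values such as those quoted above are commonly computed with Nei's unbiased estimator, HD = n/(n - 1) * (1 - sum of squared haplotype frequencies). The sketch below applies that estimator to an invented sample; whether Redd et al. used exactly this estimator is an assumption, and the haplotypes are toy values written as DYS19-DYS390-DYS391-DYS393.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased diversity: n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are needed")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy sample (invented): five males, one haplotype observed twice.
toy_sample = ["14-24-11-13", "14-24-11-13", "15-23-10-12", "16-25-10-13", "14-23-11-13"]
diversity = haplotype_diversity(toy_sample)
print(f"Haplotype diversity of the toy sample: {diversity:.3f}")
```

The estimator reaches 1 when every sampled male carries a different haplotype and falls toward 0 as one haplotype dominates the sample.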


Seven novel microsatellite markers were developed by White et al. (1999).
These microsatellites are tetranucleotide GATA repeats and are polymorphic
among unrelated individuals. Five of the seven markers were male-specific,
with no PCR product being generated from female DNA. The remaining two
markers were polymorphic in both males and females, with many shared
alleles between the sexes.




Underhill et al. (2000) studied Y-chromosome sequence variation and the
history of human populations. The study examined binary polymorphisms
associated with the non-recombining region of the human Y chromosome
(NRY), which preserves the paternal genetic legacy of our species and so
permits inference of human evolution, population affinity and demographic
history. They used denaturing high-performance liquid chromatography
(DHPLC) to identify 160 of the 166 bi-allelic and 1 tri-allelic sites that
formed a parsimonious genealogy of 116 haplotypes, several of which display
distinct population affinities based on the analysis of 1062 globally
representative individuals. The results suggested that a minority of
contemporary East Africans and Khoisan represent the descendants of the most
ancestral patrilineages of anatomically modern humans that left Africa
between 35,000 and 89,000 years ago.


1.33 Mb of sequence from the human Y chromosome was analyzed for tri- to 
hexanucleotide microsatellites (Ayub et al., 2000). Twenty loci containing a 
stretch of eight or more repeat units with complete repeat sequence 
homogeneity were found, 18 of which were novel. Six loci (one tri-, four 
tetra- and one pentanucleotide) were assembled into a single multiplex 
reaction and their degree of polymorphism was investigated in a sample of 
278 males from Pakistan. Diversities of the individual loci ranged from 0.064 
to 0.727 in Pakistan, while the haplotype diversity was 0.971. One population,
the Hazara, showed particularly low diversity, with predominantly two
haplotypes.


The reference database of highly informative Y-chromosomal short tandem
repeat (STR) haplotypes (YHRD) was devised by Roewer et al. (2001). By
September 2014, YHRD contained 136,184 9-locus ("minimal") haplotypes,
40% of which have been extended further to include two additional loci.
Establishment of YHRD has been facilitated by the joint efforts of various
forensic and anthropological institutions.

Kayser and Sajantila (2001) studied mutations at Y-STR loci and their
implications for paternity testing and forensic analysis. Knowledge about
mutation rates and the mutational process of Y-chromosomal short-tandem-
repeat (STR) or microsatellite loci used in paternity testing and forensic
analysis is crucial for the correct interpretation of resulting genetic profiles.
They analyzed a total of 4999 male germline transmissions from father/son
pairs of confirmed paternity (99.9%) at 15 Y-STR loci. They identified 14
mutations. Locus-specific mutation rate estimates varied between 0 and
8.58 x 10^-3, and the overall average mutation rate estimate was 2.80 x 10^-3.
In two confirmed father/son pairs, mutations at two Y-STRs were observed. The
probability of two mutations occurring within the same single germline
transmission was estimated to be statistically not unexpected. Additional
alleles caused by insertion polymorphisms were found at a number of Y-STRs
and a frequency of 0.12% was estimated for DYS19. The observed mutational
features for Y-STRs have important consequences for forensic applications
such as the definition of criteria for exclusions in paternity testing and the
interpretation of genetic profiles in stain analysis.
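
The overall rate reported above follows directly from the counts given in the text: 14 mutations in 4999 transmissions is approximately 2.80 x 10^-3 per transmission. The sketch below reproduces that point estimate and adds an approximate 95% Wilson score interval. The choice of the Wilson interval is an assumption made here for illustration and is not claimed to be the method used by Kayser and Sajantila.

```python
import math


def mutation_rate_with_wilson_ci(mutations, transmissions, z=1.96):
    """Point estimate and approximate 95% Wilson score interval for a mutation rate."""
    p = mutations / transmissions
    denom = 1.0 + z * z / transmissions
    centre = (p + z * z / (2 * transmissions)) / denom
    half = z * math.sqrt(p * (1 - p) / transmissions
                         + z * z / (4 * transmissions ** 2)) / denom
    return p, centre - half, centre + half


# Counts quoted in the text: 14 mutations in 4999 father-to-son transmissions.
rate, lower, upper = mutation_rate_with_wilson_ci(14, 4999)
print(f"rate = {rate:.2e}, approximate 95% interval = ({lower:.2e}, {upper:.2e})")
```

With so few observed mutations the interval is wide relative to the point estimate, which is one reason locus-specific rates from a single study have to be interpreted with caution.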


The DNA Commission of the International Society of Forensic Genetics (Gill et
al., 2001) published a series of documents providing guidelines and
recommendations concerning the application of DNA polymorphisms to the
problems of human identification. This report addressed a relatively new area,
namely Y-chromosome polymorphisms, with particular emphasis on short
tandem repeats (STRs), including nomenclature, use of allelic ladders,
population genetics and reporting methods.




In the field of molecular diagnosis, forensic casework analysis is one of the 
most demanding investigations, due to its social impact. Optimization of DNA 
typing multiplex reactions with identical cycling conditions as those required 
by autosomal short tandem repeats (STR) multiplex reduces errors, and saves 
time and reagents. Corach et al. (2001) introduced Y-STR typing into routine
forensic casework. They validated a set of five Y-STRs for multiplex PCR
(a triplex for DYS19, DYS390 and DYS391 and a duplex for
DYS392 and DYS393). Statistical attributes of the haplotypes of the five
Y-STRs investigated were evaluated in unrelated males from different
metropolitan areas of Argentina.


Reliable amplification of short tandem repeat (STR) DNA markers with the 
polymerase chain reaction (PCR) is dependent on high quality PCR primers. 
The particular primer combinations and concentrations are especially 
important with multiplex amplification reactions where multiple STR loci are 
simultaneously copied. Commercially available kits are now widely used for 
STR amplification and subsequent DNA typing. Butler et al. (2001) presented
the use of high performance liquid chromatography (HPLC) and time-of-flight
mass spectrometry (TOF-MS) methods for characterization of commercially
available STR kits and conducted a series of quality control
tests of PCR primers used in multiplex STR amplification reactions.


Copying multiple regions of a DNA molecule is routinely performed today 
using the polymerase chain reaction (PCR) in a process commonly referred to 
as multiplex PCR. The development of a multiplex PCR reaction involves 
designing primer sets and examining various combinations of those primer 
sets and different reaction components and/or thermal cycling conditions. The
process of optimizing a multiplex PCR reaction in order to obtain a
well-balanced set of amplicons can be time-consuming and labor-intensive. The
rapid separation and quantification capabilities of capillary electrophoresis
make it an efficient technique to help in the multiplex PCR optimization
process. Butler et al. (2001) utilized capillary electrophoresis as a tool for
optimization of multiplex PCR reactions.


Nineteen Y-specific short tandem repeat (STR) loci were amplified in three
multiplex reactions in 768 samples from the Iberian Peninsula in order to
evaluate their usefulness in forensic casework (Bosch et al., 2002). Two
previously published multiplex reactions were used: one by Thomas et al.
(1999) with six Y-STR loci (DYS19, DYS388, DYS390, DYS391, DYS392 and
DYS393) and one by Ayub et al. (2000) with six Y-STR loci (DYS434, DYS435,
DYS436, DYS437, DYS438 and DYS439). Bosch et al. reported another seven loci
(DYS385, DYS389, DYS460, DYS461, DYS462 and amelogenin) for this
study.


Redd et al. (2002) identified and characterized 14 novel short-tandem-repeats
(STRs) on the Y chromosome and typed them in two samples, a globally
diverse panel of 73 cell lines, and 148 individuals from a European-American
population for forensic purposes. The analyzed Y-STRs include eight
tetranucleotide repeats (DYS449, DYS453, DYS454, DYS455, DYS456,
DYS458, DYS459, and DYS464), five pentanucleotide repeats (DYS446,
DYS447, DYS450, DYS452, and DYS463), and one hexanucleotide repeat
(DYS448). Sequence data were obtained to designate a repeat number
nomenclature. The gene diversities of an additional 22 Y-STRs, including the
most commonly used in forensic databases, were directly compared in the cell
line DNAs.




A multiplex polymerase chain reaction (PCR) assay capable of simultaneously 
amplifying 20 Y chromosome short tandem repeat (STR) markers has been 
developed to aid human identity testing and male population studies by Butler
et al. (2002). These markers include all of the Y STRs that make up the
"extended haplotype" used in Europe (DYS19, DYS385, DYS389I/II, 
DYS390, DYS391, DYS392, DYS393, and YCAII) plus additional 
polymorphic Y STRs (DYS437, DYS438, DYS439, DYS447, DYS448, 
DYS388, DYS426, GATA A7.1, and GATA H4). 


A Y-chromosome multiplex polymerase chain reaction (PCR) amplification
kit, known as Y-PLEX 6, was developed for use in human identification by
Sinha et al. (2003). The Y-PLEX 6 kit enabled simultaneous amplification of
six polymorphic short tandem repeat (STR) loci located on the
non-recombining region of the human Y-chromosome (DYS393, DYS19,
DYS389II, DYS390, DYS391, and DYS385). Schoske et al. (2003)
designed a multiplex PCR for the simultaneous amplification of 10
Y-chromosome short tandem repeat (STR) loci.


Two multiplex reactions were developed to amplify 16 Y-STRs (DYS19, 
DYS385, DYS389 I and II, DYS390, DYS391, DYS392, DYS393, DYS437, 
DYS438, DYS439, GATA A7.1, GATA A7.2, GATA A10, GATA C4, 
GATA H4) (Beleza et al., 2003).


Two tribal groups from southern India (Chenchus and Koyas) were analyzed 
for variation in mitochondrial DNA (mtDNA), the Y chromosome, and one 
autosomal locus and were compared with six caste groups from different parts 
of India, as well as with western and central Asians (Kivisild et al., 2003). In
mtDNA phylogenetic analyses, the Chenchus and Koyas coalesce at
Indian-specific branches of haplogroups M and N that cover populations of different
social rank from all over the subcontinent. Coalescence times suggest early 
late Pleistocene settlement of southern Asia and suggest that there has not 
been total replacement of these settlers by later migrations. They found that H, L,
and R2 are the major Indian Y-chromosomal haplogroups that occur both in
castes and in tribal populations and are rarely found outside the subcontinent.
Haplogroup R1a, previously associated with the putative Indo-Aryan
invasion, was found at its highest frequency in Punjab but also at a relatively
high frequency (26%) in the Chenchu tribe. This finding, together with the
higher R1a-associated short tandem repeat diversity in India and Iran
compared with Europe and central Asia, suggests that southern and western 
Asia might be the source of this haplogroup. Haplotype frequencies of the 
MX1 locus of chromosome 21 distinguish Koyas and Chenchus, along with 
Indian caste groups, from European and eastern Asian populations. Taken 
together, these results show that Indian tribal and caste populations derive 
largely from the same genetic heritage of Pleistocene southern and western 
Asians and have received limited gene flow from external regions since the 
Holocene. The phylogeography of the primal mtDNA and Y-chromosome
founders suggested that the southern Asian Pleistocene coastal settlers from
Africa would have provided the inocula for the subsequent differentiation of
the distinctive eastern and western Eurasian gene pools.


Basu et al. (2003) analyzed 58 DNA markers (mitochondrial [mt],
Y-chromosomal, and autosomal) and sequence data of the mtHVS1 from a large
number of ethnically diverse populations of India in order to study its
peopling and population structure. The resulting genomic evidence suggested
that (1) there was an underlying unity of female lineages in India, indicating
that the initial number of female settlers may have been small; (2) the tribal
and the caste populations were highly differentiated; (3) the Austro-Asiatic
tribals were the
earliest settlers in India, providing support to one anthropological hypothesis 
while refuting some others; (4) a major wave of humans entered India through 
the northeast; (5) the Tibeto-Burman tribals share considerable genetic 
commonalities with the Austro-Asiatic tribals, supporting the hypothesis that 
they might have shared a common habitat in southern China; (6) the Dravidian 
tribals were possibly widespread throughout India before the arrival of the 
Indo-European-speaking nomads, but retreated to southern India to avoid 
dominance; (7) formation of populations by fission that resulted in founder 
and drift effects have left their imprints on the genetic structures of 
contemporary populations; (8) the upper castes showed closer genetic 
affinities with Central Asian populations, although those of southern India are 
more distant than those of northern India; (9) historical gene flow into India
has contributed to a considerable obliteration of genetic histories of
contemporary populations.


A study of three different Y-specific microsatellites (Y-STRs) in populations
from Uttar Pradesh (UP), Bihar (BI), Punjab (PUNJ), and Bengal (WB), who
speak modern Indic languages with roots in the Indo-Aryan family, and from
South India (SI), who speak languages with roots in the Dravidian family,
showed that the predominant alleles observed represent the whole range of
allelic variation reported in different population groups globally. The
results indicated that the Indian population is highly diverse. The study
demonstrated that the population groups, housed in eight states of the
country in different geographic locations, broadly correspond with the
Indo-Aryan and Dravidian language families. Further, analyses based on
haplotype frequency of different marker loci and gene diversity revealed that
none of the population groups had remained isolated from others. High levels
of haplotype diversity exist in all the clusters of population (Saha et al., 2003).


Das et al. (2004) studied Y-chromosome STR haplotypes among five
endogamous population groups from western and southwestern India in an 
attempt to address the issue of genetic variation and the pattern of male gene 
flow. They studied 221 males at three Y-chromosome biallelic loci and 184 
males for the five Y-chromosome STRs. They observed 111 Y-chromosome 
STR haplotypes. An analysis of molecular variance (AMOVA) based on Y- 
chromosome STRs showed that the variation observed between the population 
groups belonging to two major regions (western and southwestern India) was 
0.17%, which was significantly lower than the level of genetic variance 
among the five populations (0.59%) considered as a single group. Combined 
haplotype analysis of the five STRs and the biallelic locus 92R7 revealed 
minimal sharing of haplotypes among these five ethnic groups, irrespective of 
the similar origin of the linguistic and geographic affiliations; this minimal 
sharing indicates restricted male gene flow. As a consequence, most of the 
haplotypes were population specific. Network analysis showed that the 
haplotypes, which were shared between the populations, seem to have 
originated from different mutational pathways at different loci. Biallelic
markers showed that all five ethnic groups have a similar ancestral origin
despite their geographic and linguistic diversity.


Understanding the genetic origins and demographic history of Indian 
populations is important both for questions concerning the early settlement of 
Eurasia and more recent events, including the appearance of Indo-Aryan 
languages and settled agriculture in the subcontinent. Although there is
general agreement that Indian caste and tribal populations share a common
late Pleistocene maternal ancestry in India, some studies of the Y-
chromosome markers have suggested a recent, substantial incursion from 
Central or West Eurasia. To investigate the origin of paternal lineages of 
Indian populations, 936 Y chromosomes, representing 32 tribal and 45 caste 
groups from all four major linguistic groups of India, were analyzed for 38 
single-nucleotide polymorphic markers. Phylogeography of the major Y- 
chromosomal haplogroups in India, genetic distance, and admixture analyses 
all indicate that the recent external contribution to Dravidian- and Hindi- 
speaking caste groups has been low. The sharing of some Y-chromosomal 
haplogroups between Indian and Central Asian populations is most 
parsimoniously explained by a deep, common ancestry between the two 
regions, with diffusion of some Indian-specific lineages northward. The Y- 
chromosomal data consistently suggest a largely South Asian origin for Indian 
caste communities and therefore argue against any major influx, from regions 
north and west of India, of people associated either with the development of 
agriculture or the spread of the Indo-Aryan language family. The dyadic Y- 
chromosome composition of Tibeto-Burman speakers of India, however, can 
be attributed to a recent demographic process, which appears to have absorbed 
and overlain populations who previously spoke Austro-Asiatic languages
(Sahoo et al., 2006).


In order to investigate the genetic consequences of the Indian caste system,
Zerjal et al. (2007) analyzed male-lineage variation in a sample of 227 Indian
men of known caste, 141 from the Jaunpur district of Uttar Pradesh and 86
from the rest of India. They typed 131 Y-chromosomal binary markers and 16
microsatellites. They found striking evidence for male substructure: in
particular, Brahmins and Kshatriyas (but not other castes) from Jaunpur each
show low diversity and the predominance of a single distinct cluster of
haplotypes. Their findings confirmed the genetic isolation and drift within the
Jaunpur upper castes, which may have resulted from founder effects and
social factors.


Thangaraj et al. (2007) studied two tribal populations (Halakki and Kunabhi)
of the coastal Uttara Kannada district of Karnataka, with their informed
written consent. Both populations are endogamous and belong to the
Dravidian linguistic family. In a separate study, genomic variation was assayed
in 171 individuals by resequencing approximately 75 kb of DNA for a
comprehensive study of genetic diversity in 12 genes of the innate immune
system (Bairagya et al., 2008). Premi et al. (2009) analyzed unique signatures
of natural background radiation on human Y chromosomes from Kerala,
India.


A study was undertaken to determine the extent of diversity at 12 
microsatellite short tandem repeat (STR) loci in seven primitive tribal 
populations of India with diverse linguistic and geographic backgrounds 
(Mukherjee et al., 2009). DNA samples of 160 unrelated individuals were 
analyzed for 12 STR loci by multiplex polymerase chain reaction (PCR). 
Gene diversity analysis suggested that the average heterozygosity was 
uniformly high (>0.7) in these groups and varied from 0.705 to 0.794. The 
Hardy-Weinberg equilibrium analysis revealed that these populations were in 
genetic equilibrium at almost all the loci. The overall G(ST) value was high 
(G(ST) = 0.051; range between 0.026 and 0.098 among the loci), reflecting 
the degree of differentiation/heterogeneity of seven populations studied for 
these loci. The cluster analysis and multidimensional scaling of genetic
distances reveal two broad clusters of populations, with the Moolu Kurumba
maintaining their distinct genetic identity vis-a-vis other populations. The
genetic affinity for the three tribes of the Indo-European family could be
explained based on geography and language, but not for the four Dravidian
tribes.
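
The G(ST) statistic quoted above measures the share of total gene diversity that lies between populations. A minimal sketch for one locus, assuming Nei's definition G_ST = (H_T - H_S) / H_T with equally weighted populations, is given below; the allele frequencies are invented and the function names are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def g_st(subpop_freqs):
    """Nei's G_ST = (H_T - H_S) / H_T for one locus, with equal population weights assumed."""
    k = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    # H_S: average gene diversity within populations.
    h_s = sum(gene_diversity(f) for f in subpop_freqs) / k
    # H_T: gene diversity of the pooled (averaged) allele frequencies.
    mean_freqs = [sum(f[a] for f in subpop_freqs) / k for a in range(n_alleles)]
    h_t = gene_diversity(mean_freqs)
    return (h_t - h_s) / h_t


# Toy allele frequencies (invented) for three alleles in two populations.
toy_freqs = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
value = g_st(toy_freqs)
print(f"G_ST for the toy locus: {value:.3f}")
```

A value of 0 means the populations share identical allele frequencies, while larger values indicate that more of the diversity is partitioned between populations.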


A total of 3046 males of Chinese, Malay, Thai, Japanese, and Indian 
population affinity were typed for the Y STR loci DYS19, DYS385 (counted 
as two loci), DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, 
DYS437, DYS438, DYS439, DYS456, DYS458, DYS635, DYS448, and Y 
GATA H4 using the AmpFlSTR Yfiler kit by Budowle et al. (2009) in order
to assess the effects of Asian population substructure on Y STR forensic 
analyses. These samples were assessed for population genetic parameters that 
impact forensic statistical calculations. All population samples were highly 
polymorphic for the 16 Y STR markers with the marker DYS385 being the 
most polymorphic, because it is comprised of two loci. Most (2677 out of a 
total of 2806 distinct haplotypes) of the 16 marker haplotypes observed in the 
sample populations were represented only once in the data set. Haplotype 
diversities were greater than 99.57% for the Chinese, Malay, Thai, Japanese,
and Indian sample populations.


Giroti et al., (2010) genotyped 48 population samples of Malani individuals
(Himachal Pradesh, India) for 15 highly polymorphic autosomal STR loci and
7 Y-STR loci. Balamurugan et al. (2010) analyzed a population sample of 154
unrelated male individuals for Y chromosome STR allelic and haplotype
diversity in five ethnic Tamil populations from Tamil Nadu, India.


Nonrecombining Y-chromosomal microsatellites (Y-STRs) are widely used to
infer population histories, discover genealogical relationships, and identify
males for criminal justice purposes. Although a key requirement for their
application is reliable mutability knowledge, empirical data are only available
for a small number of Y-STRs thus far. Ballantyne et al., (2010) analyzed
nearly 2000 DNA-confirmed father-son pairs, covering an overall number of
352,999 meiotic transfers. Following confirmation by DNA sequence
analysis, the retrieved mutation data were modeled via a Bayesian approach.
With the 924 mutations at 120 Y-STR markers, a non-significant excess of
repeat losses versus gains (1.16:1), as well as a strong and significant excess
of single-repeat versus multirepeat changes (25.23:1), was observed. Although
the total repeat number influenced Y-STR locus mutability most strongly,
repeat complexity, the length in base pairs of the repeated motif, and the
father's age also contributed to Y-STR mutability.


A forensic Y-STR database generated in the US was compiled with profiles 
containing a portion or complete typing of 16 STR markers DYS19, DYS385, 
DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, 
DYS438, DYS439, DYS456, DYS458, DYS635, DYS448, and Y GATA H4 
(Ge et al., 2010). 


Linguistic and ethnic diversity throughout the Himalayas suggests that this
mountain range played an important role in shaping the genetic landscapes of
the region. Gayden et al., (2011) analyzed 17 Y-chromosomal short tandem
repeat (Y-STR) loci among unrelated males from three Nepalese populations
(Tamang, Newar, and Kathmandu) and a general collection from Tibet. The
latter displayed the highest haplotype diversity (0.9990) followed by
Kathmandu (0.9977), Newar (0.9570), and Tamang (0.9545). The overall
haplotype diversity for the Himalayan populations at 17 Y-STR loci was
0.9973, and the corresponding values for the extended (11 loci) and minimal
(nine loci) haplotypes were 0.9955 and 0.9942, respectively. No Y-STR
profiles were shared across the four Himalayan collections at the 17-, 11-, and
nine-locus resolutions considered, indicating a lack of recent gene flow among
them. Phylogenetic analyses supported the authors' previous findings that
Kathmandu, and to some extent Newar, received significant genetic influence
from India while Tamang and Tibet exhibit limited or no gene flow from the
subcontinent.


Nair et al., (2011) analyzed 8 short tandem repeat (STR) loci on the Y
chromosome to characterize the haplotypes of the Ezhava population of Kerala,
South India, and to trace the paternal genetic lineage of the population.


Yadav et al., (2011) analyzed 17 Y-specific STR loci (DYS19, DYS389I,
DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385a/b, DYS437,
DYS438, DYS439, DYS448, DYS456, DYS458, DYS635 and
Y_GATA_H4) in 181 unrelated male individuals in the Saraswat Brahmin
population from three North Indian states. A total of 157 different 17-loci
haplotypes were identified, 145 of which were unique. The most frequent
haplotype was detected in nine instances, occurring with a frequency of
4.97%.


Regueiro et al., (2012) analyzed an ancestral modal Y-STR haplotype
shared among Romani and South Indian populations. In that study, 161
Y-chromosomes from Roma residing in two different provinces of Serbia were
analyzed.


Parvathy et al., (2012) analyzed haplotype data of 17 Y-STR markers in
Kerala nontribal populations. Chennakrishnaiah et al., (2013) characterized
indigenous and foreign Y-chromosomes among the Lingayat and
Vokkaliga populations of Southwest India. Mukerjee et al., (2013) studied the
differential pattern of genetic variability at the DXYS156 locus on
homologous regions of the X and Y chromosomes in the Indian population and
its forensic implications.


Wei et al., (2013) conducted research on a calibrated human Y-chromosomal
phylogeny based on resequencing. They identified variants
present in high-coverage complete sequences of 36 diverse human Y
chromosomes from Africa, Europe, South Asia, East Asia, and the Americas,
representing eight major haplogroups.


Perveen et al., (2014) studied Y-STR haplotype diversity in the Punjabi
population of Pakistan.


Khurana et al., (2014) analyzed the Y chromosome haplogroup
distribution in Indo-European speaking tribes of Gujarat, Western India. The
study was carried out in the Indo-European speaking tribal population groups
of Southern Gujarat, India to investigate and reconstruct their paternal
population structure and population histories. The role of language, ethnicity
and geography in determining the observed pattern of Y haplogroup clustering
in the study populations was also examined. A set of 48 bi-allelic markers on
the non-recombining region of the Y chromosome (NRY) was analyzed in 284
males representing nine Indo-European speaking tribal populations. The
phylogenetic analysis revealed 13 paternal lineages, of which six haplogroups
(C5, H1a*, H2, J2, R1a1* and R2) accounted for a major portion of the Y
chromosome diversity. The higher frequency of the six haplogroups and the
pattern of clustering in the populations indicated overlapping of haplogroups
with West and Central Asian populations.




After the discovery of the structure of DNA and the successful completion of the
Human Genome Project, the major challenge for scientists is to decipher human
genome variation. The Indian population is an amalgamation of various ethnic,
cultural and geographical groups. DNA marker-based studies on the Indian
population have revealed a large extent of genetic variation
among various populations. Although data on various Indian populations have
been reported, there are no published data available on the genetic structure
of the Khandayat population of Odisha elucidating the haplotype diversity
based on 17 Y-STR loci. Therefore, this research was designed to understand
the genetic diversity of 17 Y-STR loci of Khandayats and compare them with
Non-Khandayats of Odisha and the random Indian population.




Methodology 


3.1 OBJECTIVES 
1. To study polymorphism of Y-STRs in the Khandayat and non-Khandayat
   population of Odisha and to find out haplotype diversity at 17 Y-STR loci.
2. To compare the haplotype diversity at 17 Y-STR loci of Khandayat
   and non-Khandayat population of Odisha and random Indian
   population via statistical analysis for genetic relatedness.


3.2 HYPOTHESES 
1. Similarities in the haplotype diversity pattern of Y-STRs of the
   Khandayat and non-Khandayat population of Odisha may be observed.
2. Khandayat population may be genetically related to some of the
   studied random Indian populations.




3.3 SAMPLE ANALYSIS 


The samples were analyzed through the following steps. 


• Sample collection & preservation
• DNA extraction
• DNA Quantity & Quality check
• Multiplex PCR
• Genotyping
• Statistical Analysis


3.3.1 SAMPLE COLLECTION & PRESERVATION 


Whole blood samples (2 ml) were collected using standard procedure in EDTA
vacutainers (BD Biosciences, NJ, USA) from 300 healthy unrelated males of
Odisha, India (150 samples from Khandayats and 150 samples from
non-Khandayats), along with proper consent as approved by the Ethical
Committee, and stored at 4°C till further analysis.


3.3.2 DNA EXTRACTION 


Genomic DNA was isolated by the standard organic method (Phenol-Chloroform
extraction method) (Sambrook et al., 2001).


DNA Extraction Protocol 


1. Add 1 ml of Lysis Buffer-I to 1 ml of blood and mix properly.
2. Incubate at -80°C for 2 hrs.
3. Shift the centrifuge tube to 65°C and keep it for 10 min.
4. Centrifuge for 15 min at 4600 rpm, 4°C.
5. Discard supernatant. Add 1 ml Lysis Buffer-II to the pellet and mix
   properly.
6. Then add 2% SDS & Proteinase K. Mix properly by mild tapping.
7. Incubate at 37°C overnight in water-bath.
8. Cool to room temperature. Add 1 ml of Phenol. Mix for 15 minutes by
   inverting the tube.
9. Centrifuge for 15 minutes at 4600 rpm.
10. Discard organic phase. Shift the aqueous phase in another tube.
11. Add 1 ml of Phenol: Chloroform (1:1) to the aqueous phase. Mix for
    15 minutes.
12. Centrifuge for 15 minutes at 4600 rpm.
13. Discard organic phase. Shift the aqueous phase in another tube.
14. Add 1 ml of Chloroform: Iso-amyl alcohol (24:1) to the aqueous
    phase. Mix for 15 minutes.
15. Centrifuge for 15 minutes at 4600 rpm.
16. Discard organic phase. Shift the aqueous phase in another tube.
17. To aqueous phase, add 1 ml chilled Propanol & 0.1 ml of Sodium
    Acetate solution (3M).
18. Precipitate the DNA by mixing properly. Centrifuge for 1 min
    (Popspin).
19. Wash DNA pellet by 70% ethanol. Centrifuge (Popspin).
20. Dry the DNA pellet at room temperature & dissolve DNA in TE
    buffer.


Reagents Used 

• Lysis buffer: I & II
• Ethanol (70%)
• Phenol (pH 8.0)
• Chloroform: Isoamyl Alcohol (24:1)
• Chilled Isopropanol
• Sodium acetate solution (3 M)
• Proteinase K (20 mg/ml)
• 20% (w/v) SDS
• TE (pH 7.6)
• Agarose
• 10X TBE
• Ethidium Bromide
• Loading dye


Functions of different reagents used in DNA extraction 


• SDS: It is a detergent. It helps in lysis of cells by removing lipid
  molecules and thereby causes disruption of the cell membrane.
• Proteinase K: It breaks down peptides into smaller units and hence
  facilitates the removal of protein from the cell extract during
  treatment.
• Phenol: Chloroform: Isoamyl alcohol (25:24:1): Phenol and
  chloroform act as protein solvents and help in removal of protein. The
  organic phase contains protein and cell debris, whereas the aqueous phase
  contains nucleic acids. Isoamyl alcohol reduces formation of froth
  during the extraction process.
• Sodium Acetate & chilled propanol: These are used in precipitation of
  DNA.
• 70% Ethanol: It is used to remove salts and contaminants.
• TE Buffer: DNA is preserved in TE Buffer.
• Lysis Buffer-I (pH 8.0)
  30 mM Tris-Cl
  5 mM EDTA
  50 mM NaCl
• Lysis Buffer-II (pH 8.0)
  2 mM EDTA
  75 mM NaCl

Store the buffer at room temperature.


3.3.3 DNA QUANTIFICATION 
The extracted DNA samples were quantified by spectrophotometer.
• Optical density (OD) of the DNA samples was recorded at two different
  wavelengths (λ1 = 260 nm, λ2 = 280 nm).
• Quantity of DNA (µg/ml) = OD at 260 nm × 50 × dilution factor
  (50 µg/ml is the concentration of double-stranded DNA that gives an OD
  of 1.0 at 260 nm)
DNA QUALITY
• Quality of DNA was checked by agarose gel electrophoresis (0.8%
  agarose gel) and verified with the help of GelDoc System.
• Spectrophotometric analysis was also carried out.
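
The quantification arithmetic above can be illustrated with a short Python sketch. The function names and the example readings are hypothetical and are not taken from the study; the OD260/OD280 ratio is included only because readings at both wavelengths were recorded, and a ratio of about 1.8 is a commonly quoted guide value for protein-free DNA, not a threshold stated in this work.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor, factor=50.0):
    """Quantity of DNA (ug/ml) = OD at 260 nm x 50 x dilution factor."""
    return od260 * factor * dilution_factor


def purity_ratio(od260, od280):
    """OD260/OD280 ratio, often used as a rough indicator of protein carry-over."""
    return od260 / od280


# Hypothetical readings for a sample diluted 1:20 before measurement.
concentration = dna_concentration_ug_per_ml(od260=0.25, dilution_factor=20)
ratio = purity_ratio(od260=0.25, od280=0.14)
print(f"{concentration:.1f} ug/ml, OD260/OD280 = {ratio:.2f}")
```

Keeping the factor of 50 as a default argument makes the assumption explicit and allows it to be changed for other nucleic acid types.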


Agarose Gel Electrophoresis 


• For non-PCR products 0.8% agarose gel is used.
  0.8% agarose gel = 0.4 g Agarose + 50 ml TBE Buffer (1X)
• Add 5 µl of Ethidium Bromide after properly mixing agarose in the
  buffer.
• Once the gel is cast, the comb is removed. 5 µl of each DNA sample is
  mixed with 5 µl of loading dye and then loaded into a well of the gel. After
  loading, the gel was run at 70 V for 15 minutes.
• Then the gel was visualized under UV using the Gel Doc System.




3.3.4 MULTIPLEX PCR 

The DNA samples were amplified in a Thermal Cycler (PTC 200, MJ Research
Inc., US) using the AmpFlSTR® Yfiler® PCR Amplification Kit (Applied
Biosystems, Foster City, CA, USA) for 17 Y-STR loci simultaneously by
Multiplex PCR as per the manufacturer’s instructions. The analyzed Y-STR
loci include DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392,
DYS393, DYS385a/b, DYS437, DYS438, DYS439, DYS448, DYS456,
DYS458, DYS635 and YGATAH4.


Table 3.1: Yfiler Kit loci and alleles 


Locus designation | Alleles included in AmpFlSTR Yfiler Allelic Ladder                         | Dye label
DYS456            | 13, 14, 15, 16, 17, 18                                                     | 6-FAM™
DYS389 I          | 10, 11, 12, 13, 14, 15                                                     | 6-FAM™
DYS390            | 18, 19, 20, 21, 22, 23, 24, 25, 26, 27                                     | 6-FAM™
DYS389 II         | 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34                                 | 6-FAM™
DYS458            | 14, 15, 16, 17, 18, 19, 20                                                 | VIC®
DYS19             | 10, 11, 12, 13, 14, 15, 16, 17, 18, 19                                     | VIC®
DYS385 a/b        | 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25    | VIC®
DYS393            | 8, 9, 10, 11, 12, 13, 14, 15, 16                                           | NED™
DYS391            | 7, 8, 9, 10, 11, 12, 13                                                    | NED™
DYS439            | 8, 9, 10, 11, 12, 13, 14, 15                                               | NED™
DYS635            | 20, 21, 22, 23, 24, 25, 26                                                 | NED™
DYS392            | 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18                                | NED™
Y GATA H4         | 8, 9, 10, 11, 12, 13                                                       | PET®
DYS437            | 13, 14, 15, 16, 17                                                         | PET®
DYS438            | 8, 9, 10, 11, 12, 13                                                       | PET®
DYS448            | 17, 18, 19, 20, 21, 22, 23, 24                                             | PET®




AmpFlSTR® Yfiler® fluorescent multi-color dye technology allows the analysis
of multiple loci, including loci that have alleles with overlapping size ranges.
Alleles for overlapping loci are distinguished by labeling locus-specific
primers with different colored dyes. Multi-component analysis is the process
that separates the five different fluorescent dye colors into distinct spectral
components. The four dyes used in the Yfiler® Kit to label samples are
6-FAM™, VIC®, NED™, and PET® dyes. The fifth dye, LIZ® dye, is used to
label the GeneScan™ 500 LIZ® Size Standard or the GeneScan™ 600 LIZ®
Size Standard v2.0. Each of these fluorescent dyes emits its maximum
fluorescence at a different wavelength. During data collection on the Life
Technologies instruments, the fluorescence signals are separated by a
diffraction grating according to their wavelengths and projected onto a
charge-coupled device (CCD) camera in a predictably spaced pattern. The 6-FAM™
dye emits at the shortest wavelength and it is displayed as blue, followed by
the VIC® dye (green), NED™ dye (yellow), PET® dye (red), and LIZ® dye
(orange).


Although each of these dyes emits its maximum fluorescence at a different
wavelength, there is some overlap in the emission spectra between the dyes
(Figure 3.1). The goal of multi-component analysis is to correct for spectral
overlap.




[Plot of the dye emission spectra; x-axis: Wavelength (nm)]


Figure 3.1: Emission spectra of the five dyes used in the Yfiler Kit 


3.3.4.1 Preparation of PCR Master Mix 
1. Master mix was prepared.
   AmpFlSTR® PCR Reaction Mix = 9.2 µL
   AmpFlSTR® Yfiler® Primer Set = 5.0 µL
   AmpliTaq Gold® DNA Polymerase = 0.8 µL
   Total volume = 15 µL
2. DNA samples were prepared.
   • Negative control - Add 10 µL TE buffer (10 mM Tris, 0.1 mM EDTA,
     pH 8.0).
   • Test sample - Dilute a portion of the test DNA sample with low-TE
     buffer so that 1.0 ng of total DNA is in a final volume of 10 µL. Add
     10 µL of the diluted sample to the reaction mix.
   • Positive control - Add 10 µL of control DNA (0.1 ng/µL).
3. The final reaction volume (sample or control plus master mix) is 25 µL.
4. The sample mixtures were amplified in the thermal cycler (under the
   conditions described in Section 3.3.4.2).
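
The per-reaction volumes listed above scale linearly with the number of reactions. The following Python sketch shows that arithmetic; the function names, the 10% pipetting overage and the example stock concentration are illustrative assumptions and are not part of the protocol followed in this study.

```python
# Per-reaction volumes in microlitres, as listed in the master mix above.
PER_REACTION_UL = {
    "AmpFlSTR PCR Reaction Mix": 9.2,
    "AmpFlSTR Yfiler Primer Set": 5.0,
    "AmpliTaq Gold DNA Polymerase": 0.8,
}
SAMPLE_UL = 10.0  # volume of diluted DNA (or control) added per reaction


def master_mix(n_reactions, overage=0.10):
    """Scale each component for n reactions plus a fractional overage (assumed)."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


def dilution_for_target(stock_ng_per_ul, target_ng=1.0, final_ul=SAMPLE_UL):
    """Return (stock volume, buffer volume) in uL giving target_ng in final_ul."""
    stock_ul = target_ng / stock_ng_per_ul
    if stock_ul > final_ul:
        raise ValueError("stock is too dilute to reach the target amount")
    return stock_ul, final_ul - stock_ul


print(master_mix(10))            # component volumes for 10 reactions with overage
print(dilution_for_target(5.0))  # hypothetical 5 ng/uL extract
```

The three components add up to 15 µL per reaction, so adding the 10 µL sample gives the 25 µL final reaction volume stated above.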




3.3.4.2 Cycling conditions for Multiplex PCR 


Initial incubation step | Denature    | Anneal      | Extend      | Final extension | Final hold
Hold                    | 30 Cycles                               | Hold            | Hold
95°C, 11 min            | 94°C, 1 min | 61°C, 1 min | 72°C, 1 min | 60°C, 80 min    | 4°C, ∞


3.5 GENOTYPING 


1. Amplicons (PCR products) were analyzed on ABI Prism 3130xl
   Automated Genetic Analyzer (Applied Biosystems, Foster City, CA,
   USA). Allelic designations for different loci were obtained by
   GeneMapper ID software (v. 3.2).
2. For genotyping, 9 µL of Hi-Di™ Formamide and size standard mixture was
   prepared per sample.
   GeneScan™ 600 LIZ® Size Standard = 0.5 µL
   Hi-Di™ Formamide = 8.5 µL
3. Into each well of a MicroAmp® Optical 96-Well Reaction Plate, 10 µL
   of sample was added:
   9 µL of the formamide:size standard mixture
   1 µL of PCR product or allelic ladder
4. The reaction plate was sealed with appropriate septa, then centrifuged
   to ensure that the contents of each well were collected at the bottom.
5. The reaction plate was heated in a thermal cycler for 3 minutes at 95°C
   and then immediately placed on ice for 3 minutes.
6. Plate assembly was prepared and placed on the autosampler.
7. Electrophoresis run was started.
8. After electrophoresis, the data collection software stores information
   for each sample in a .fsa file. Analysis of allelic designations for
   different loci and interpretation of results were carried out using
   GeneMapper® ID Software (v3.2).


3.6 STATISTICAL ANALYSIS 
Allele frequencies were calculated by direct counting. Gene diversity (GD)
was calculated using the formula (Nei, 1973, 1974):

GD = \frac{n}{n-1}\left(1 - \sum_{i} P_i^{2}\right)

Where, P_i is the frequency of the ith allele and n is the number of samples analyzed.

Haplotype diversity (HD) was calculated as:

HD = \frac{n}{n-1}\left(1 - \sum_{i} X_i^{2}\right)

Where, X_i represents the frequency of the ith haplotype.


Standard errors for HD were calculated according to the following equation
(Nei & Kumar, 2000):

SE = \sqrt{\frac{2}{n}\left[\sum_{i} X_i^{3} - \left(\sum_{i} X_i^{2}\right)^{2}\right]}


Discrimination capacity (DC) was determined by the formula N_HD/N, where N_HD
is the number of different haplotypes observed and N is the total number of
samples analyzed.
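
The diversity statistics described in this section can be sketched in a few lines of Python. This is an illustrative sketch only: the haplotype labels are hypothetical toy data, the function names are not from any software used in this study, and the standard error is written in the large-sample form sqrt(2/n [sum x^3 - (sum x^2)^2]), which is an assumption about the exact expression applied.

```python
from collections import Counter
from math import sqrt


def diversity(counts):
    """n/(n-1) * (1 - sum of squared frequencies); the same form serves GD and HD."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))


def standard_error(counts):
    """Assumed large-sample SE: sqrt(2/n * (sum x^3 - (sum x^2)^2))."""
    n = sum(counts)
    freqs = [c / n for c in counts]
    s2 = sum(f ** 2 for f in freqs)
    s3 = sum(f ** 3 for f in freqs)
    # max() guards against a tiny negative value caused by rounding.
    return sqrt(max(0.0, 2 / n * (s3 - s2 ** 2)))


def discrimination_capacity(counts):
    """Number of different haplotypes divided by the number of samples."""
    return len(counts) / sum(counts)


# Hypothetical toy sample: five males, one haplotype shared by two of them.
haplotypes = ["H1", "H1", "H2", "H3", "H4"]
counts = list(Counter(haplotypes).values())
print(diversity(counts), standard_error(counts), discrimination_capacity(counts))
```

Applied to allele counts at a single locus, `diversity` gives GD; applied to whole-haplotype counts, it gives HD. As a check on the DC definition, 146 different haplotypes among 150 males give 146/150 = 0.9733.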


66 


Methodology 


3.6.1 AMOVA 

For analysis of molecular variance (AMOVA), the online AMOVA tool provided
by YHRD (Roewer et al., 1996; Roewer et al., 2001) was used. A total of 11
population samples with 840 haplotypes were included in this study:
Jharkhand Sakaldwipi Brahmin population, Karnataka Brahmin population,
Kashmir Saraswat Brahmin population, Maharashtra Mahadev Koli
population, Punjab Balmiki population, Rajasthan Saraswat Brahmin
population, Tamil population, Tamil Nadu Iyengar population, Tripuri
population, West Bengal Rajbanshi population and Odisha Khandayat
population.


3.6.2 MDS plot 

Population pairwise distances (Rst values) between the Khandayat population of
Odisha and other Indian populations were calculated. Graphical
representations of genetic distances between populations were obtained by
multidimensional scaling analysis (MDS plot). MDS plots were constructed
based on genetic distances (Nei & Roychoudhury, 1974).


3.6.3 Dendrogram
A dendrogram was constructed using the DendroUPGMA software program
(www.genomes.urv.cat/UPGMA). This program calculates a similarity
coefficient between pairs of sets of variables, transforms these coefficients
into distances and performs clustering using the Unweighted Pair Group
Method with Arithmetic mean (UPGMA) algorithm. The dendrogram was
constructed using Rst values.
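
To make the clustering step concrete, a minimal pure-Python sketch of the UPGMA procedure is given below. It is not the DendroUPGMA implementation; the function name, the population labels and the pairwise distances are hypothetical and serve only to show how the closest clusters are joined and how distances to a merged cluster are averaged.

```python
from itertools import combinations


def upgma(labels, distances):
    """Cluster labels by UPGMA.

    distances maps frozenset({a, b}) -> distance for every pair of labels.
    Returns (tree, merges): tree is a nested tuple and merges lists
    (cluster_a, cluster_b, branch_height) in the order of joining.
    """
    sizes = {label: 1 for label in labels}  # active clusters and their sizes
    d = dict(distances)                     # working copy, extended on each merge
    merges = []
    while len(sizes) > 1:
        # Join the closest pair of active clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Distance to each remaining cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
        merges.append((a, b, height))
    (tree,) = sizes
    return tree, merges


# Hypothetical pairwise distances, for illustration only (not data from this study).
example = {
    frozenset(("Pop A", "Pop B")): 0.02,
    frozenset(("Pop A", "Pop C")): 0.10,
    frozenset(("Pop B", "Pop C")): 0.12,
}
tree, merges = upgma(["Pop A", "Pop B", "Pop C"], example)
print(tree)
```

In this toy example the two closest populations are joined first, and the third joins the pair at half the averaged distance.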




Results & Discussion 


300 whole blood samples were collected from healthy unrelated males of 
Odisha (150 samples from Khandayats and 150 samples from non- 
Khandayats) for analysis of haplotype diversity at 17 Y-STR loci. 


Khandayat Population 

Among the Khandayat population, 146 different haplotypes were observed.
The observed haplotype details are given in Supplementary Table 1. One
hundred and forty three (143) haplotypes were unique (97.9452%), i.e., they
were observed only once. Two haplotypes were observed twice (1.3699%) and
only one haplotype was observed thrice (0.6849%).

Observed alleles among the Khandayats for the 17 Y-STR loci along with
their allelic frequencies are listed in Table 4.1. The total number
of alleles observed in this population was found to be 106 and the mean allele
number per locus was 6.235. The maximum number of alleles was observed at the
bi-allelic marker DYS385a/b with 33 alleles, followed by locus DYS635 with
7 alleles. Allele frequencies of Khandayats of Odisha varied from 0.0074 to
0.7426.

Gene diversity (GD) per locus ranged from 0.4223 to 0.9609 with an average
GD value of 0.6892. The lowest gene diversity (0.4223) was found at
locus DYS391, wherein the most frequent allele was allele 10 with a
frequency of 74.26%. The highest gene diversity (0.9609) was found in
case of the bi-allelic marker DYS385a/b. Haplotype diversity (HD) value for
the Khandayat population of Odisha was found to be 0.999128.
Discrimination capacity (DC) value for the studied samples was calculated to
be 0.97333.




[Table 4.1 is printed rotated in the original and its cell values are not
legible in this copy. Layout: one column per Y-STR locus and one row per
allele, with the gene diversity (GD) value of each locus and the haplotype
diversity (HD) of the population given with the table.]


Table 4.1: Allele frequency and gene diversity values of 17 Y-STR loci in the
Khandayat population of Odisha, India.


Non-Khandayat Population 

Among the non-Khandayat population, 143 different haplotypes were
observed. The observed haplotype details are given in Supplementary
Table 2. 138 haplotypes were unique (96.5035%), i.e., they were observed only
once. Three haplotypes were observed twice (2.0979%) and two haplotypes
were observed thrice (1.3986%).

Observed alleles among the non-Khandayats for the 17 Y-STR loci along with
their allelic frequencies are listed in Table 4.2. The total number
of alleles observed in this population was found to be 112 and the mean allele
number per locus was 6.588. The maximum number of alleles was observed at the
bi-allelic marker DYS385a/b with 39 alleles, followed by locus DYS635 with
7 alleles. Allele frequencies of non-Khandayats of Odisha varied from 0.0088 to
0.7719.

Gene diversity (GD) per locus ranged from 0.3802 to 0.9635 with an average
GD value of 0.6835. The lowest gene diversity (0.3802) was found at
locus DYS391, wherein the most frequent allele was allele 10 with a
frequency of 77.19%. The highest gene diversity (0.9635) was found in
case of the bi-allelic marker DYS385a/b.

Haplotype diversity (HD) value for the non-Khandayat population of Odisha was
found to be 0.99891321. Discrimination capacity (DC) value for the studied
samples was calculated to be 0.95333.




[Table 4.2 is printed rotated in the original and its cell values are not
legible in this copy. Layout: one column per Y-STR locus and one row per
allele, with the gene diversity (GD) value of each locus and the haplotype
diversity (HD) of the population given with the table. Table footnote, as far
as legible: the table shows allele frequencies for the Y-STR loci except
DYS385a/b, for which genotype frequencies are reported; the genotype
frequencies for DYS385a/b were calculated for the combination of the two
alleles.]

Table 4.2: Allele frequency and gene diversity values of 17 Y-STR loci in the
non-Khandayat population of Odisha, India.


[Chart for locus DYS19; x-axis: Allele; y-axis: Frequency. Data labels as
printed; allele labels are not legible.]
Non-Khandayat   0.1228  0.3860  0.1930  0.2281  0.0702
Khandayat       0.1544  0.3676  0.1838  0.2353  0.0588

Figure 4.1: Allelic frequency distribution for locus DYS19 among Khandayat
and non-Khandayat population samples


Allelic frequency for locus DYS19 among the Khandayats varied from 0.0588 
to 0.3676. Among the non-Khandayats, allelic frequency for this locus ranged 
from 0.0702 to 0.3860. 


[Chart for locus DYS389I; x-axis: Allele; y-axis: Frequency; series:
Non-Khandayat, Khandayat. Only one pair of data labels is legible:
Non-Khandayat 0.0088, Khandayat 0.0074.]


Figure 4.2: Allelic frequency distribution for locus DYS389I among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS389I among the Khandayats varied from 
0.0074 to 0.4632. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0088 to 0.4649. 


[Chart for locus DYS389II; x-axis: Allele; y-axis: Frequency; series:
Non-Khandayat, Khandayat.]


Figure 4.3: Allelic frequency distribution for locus DYS389II among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS389II among the Khandayats varied from 
0.1471 to 0.3529. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.1404 to 0.3684. 


[Chart for locus DYS390; x-axis: Allele; y-axis: Frequency; series:
Non-Khandayat, Khandayat.]


Figure 4.4: Allelic frequency distribution for locus DYS390 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS390 among the Khandayats varied from 
0.1765 to 0.3897. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.1579 to 0.4123. 


[Chart for locus DYS391; x-axis: Allele; y-axis: Frequency; series:
Non-Khandayat, Khandayat. Only one pair of data labels is legible:
Non-Khandayat 0.0702, Khandayat 0.0588.]


Figure 4.5: Allelic frequency distribution for locus DYS391 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS391 among the Khandayats varied from 
0.0441 to 0.7426. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0088 to 0.7719. 


[Chart for locus DYS392; y-axis: Frequency; series: Non-Khandayat, Khandayat.
Legible data labels (allele labels not legible):]
Non-Khandayat   0.0439  0.0439
Khandayat       0.0368  0.0662


Figure 4.6: Allelic frequency distribution for locus DYS392 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS392 among the Khandayats varied from 
0.0368 to 0.5441. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0439 to 0.5614. 


[Chart for locus DYS393; y-axis: Frequency; series: Non-Khandayat, Khandayat.
Only one pair of data labels is legible: Non-Khandayat 0.0088, Khandayat
0.0074.]


Figure 4.7: Allelic frequency distribution for locus DYS393 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS393 among the Khandayats varied from 
0.0074 to 0.3971. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0088 to 0.3860. 


[Chart for locus DYS438; x-axis: Allele; y-axis: Frequency; series:
Non-Khandayat, Khandayat. Only one pair of data labels is legible:
Non-Khandayat 0.0351, Khandayat 0.0294.]


Figure 4.8: Allelic frequency distribution for locus DYS438 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS438 among the Khandayats varied from 
0.0294 to 0.3750. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0351 to 0.3596. 


[Chart for locus YGATAH4; x-axis: Allele; y-axis: Frequency; series:
Non-Khandayat, Khandayat. Only one pair of data labels is legible:
Non-Khandayat 0.0526, Khandayat 0.0588.]


Figure 4.9: Allelic frequency distribution for locus YGATAH4 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus YGATAH4 among the Khandayats varied from 
0.0588 to 0.4559. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0526 to 0.5088. 


[Chart for locus DYS439; x-axis: Allele; y-axis: Frequency. Data labels as
printed; allele labels are not legible.]
Non-Khandayat   0.2456  0.3421  0.2632  0.1053  0.0439
Khandayat       0.2500  0.3456  0.2574  0.1103  0.0368


Figure 4.10: Allelic frequency distribution for locus DYS439 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS439 among the Khandayats varied from 
0.0368 to 0.3456. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0439 to 0.3421. 


[Chart for locus DYS437; x-axis: Allele; y-axis: Frequency. Data labels:]
Allele          13      14      15      16      17
Non-Khandayat   0.0088  0.5526  0.3246  0.1053  0.0088
Khandayat       0.0074  0.5588  0.3162  0.1103  0.0074


Figure 4.11: Allelic frequency distribution for locus DYS437 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS437 among the Khandayats varied from 
0.0074 to 0.5588. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0088 to 0.5526. 


[Chart for locus DYS448; x-axis: Allele; y-axis: Frequency. Legible data
labels:]
Allele          17      18      22
Non-Khandayat   0.0088  0.1140  0.0175
Khandayat       0.0074  0.1618  0.0147


Figure 4.12: Allelic frequency distribution for locus DYS448 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS448 among the Khandayats varied from 
0.0074 to 0.5294. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0088 to 0.5439. 


[Chart for locus DYS456; y-axis: Frequency. Legible data labels:]
Allele          13      14
Non-Khandayat   0.0088  0.0614
Khandayat       0.0074  0.0515


Figure 4.13: Allelic frequency distribution for locus DYS456 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS456 among the Khandayats varied from 
0.0074 to 0.5368. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0088 to 0.5614. 


[Chart for locus DYS458; x-axis: Allele; y-axis: Frequency. Legible data
labels:]
Allele          14      15      19
Non-Khandayat   0.0351  0.2018  0.0351
Khandayat       0.0294  0.1838  0.0294


Figure 4.14: Allelic frequency distribution for locus DYS458 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS458 among the Khandayats varied from 
0.0294 to 0.2941. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0351 to 0.3719. 


[Chart for locus DYS635; y-axis: Frequency. Data labels:]
Allele          19      20      21      22      23      24      25
Non-Khandayat   0.0351  0.1316  0.2456  0.1140  0.2632  0.1667  0.0439
Khandayat       0.0294  0.1250  0.2721  0.1324  0.2500  0.1544  0.0368


Figure 4.15: Allelic frequency distribution for locus DYS635 among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS635 among the Khandayats varied from 
0.0294 to 0.2721. Among the non-Khandayats, allelic frequency for this locus 
ranged from 0.0351 to 0.2632. 


[Chart for locus DYS385a/b; x-axis: Allele (33 allele pairs, labels not
legible); y-axis: Frequency. Data labels as printed, rounded to two decimals:]
Non-Khandayat   0.00 0.00 0.00 0.00 0.03 0.03 0.07 0.04 0.00 0.05 0.03 0.00 0.02 0.00 0.07 0.06 0.07 0.05 0.02 0.00 0.00 0.06 0.02 0.04 0.00 0.05 0.02 0.03 0.00 0.03 0.01 0.00 0.01
Khandayat       0.00 0.00 0.00 0.00 0.03 0.02 0.06 0.03 0.00 0.05 0.04 0.01 0.02 0.00 0.07 0.05 0.08 0.05 0.02 0.01 0.00 0.05 0.02 0.03 0.00 0.05 0.02 0.02 0.01 0.03 0.01 0.00 0.01


Figure 4.16: Allelic frequency distribution for locus DYS385a/b among 
Khandayat and non-Khandayat population samples 


Allelic frequency for locus DYS385a/b among the Khandayats varied from
0.0074 to 0.0809. Among the non-Khandayats, allelic frequency for this locus
ranged from 0.0088 to 0.0702.


Gene Diversity (GD) (bar chart; x-axis: Y-STR Loci, y-axis: Gene Diversity Value)

Locus       Non-Khandayat  Khandayat
DYS19       0.7483         0.7539
DYS389I     0.6508         0.6500
DYS389II    0.7204         0.7233
DYS390      0.7143         0.7269
DYS391      0.3802         0.4223
DYS392      0.6210         0.6353
DYS393      0.6811         0.6672
DYS438      0.6876         0.6816
DYS439      0.7469         0.7438
DYS437      0.5831         0.5797
DYS448      0.6069         0.6236
DYS456      0.5881         0.5987
DYS458      0.7844         0.7761
DYS635      0.8163         0.8102
YGATAH4     0.6423         0.6742
DYS385a/b   0.9635         0.9609


Figure 4.17: Gene Diversity (GD) value of various Y-STR loci among 
Khandayat and non-Khandayat population samples 


Gene Diversity for various Y-STR loci among the Khandayats varied from
0.4223 to 0.9609. Among the non-Khandayats, gene diversity value for
various Y-STR loci ranged from 0.3802 to 0.9635.
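
The way a gene diversity value follows from allele frequencies can be illustrated with a minimal sketch. It assumes that Nei's unbiased estimator, GD = n/(n-1) * (1 - sum of squared allele frequencies), was used; the estimator is not stated in this section. The frequencies are those printed for locus DYS635 (Figure 4.15). The sample sizes 136 and 114 are assumptions inferred from the smallest reported frequencies (0.0074 is about 1/136 and 0.0088 is about 1/114), and the function name is hypothetical.

```python
def gene_diversity(freqs, n):
    """Nei's unbiased gene diversity: n/(n-1) * (1 - sum(p_i^2))."""
    if n < 2:
        raise ValueError("need at least two samples")
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


# DYS635 allele frequencies (alleles 19 to 25) as printed in Figure 4.15.
khandayat = [0.0294, 0.1250, 0.2721, 0.1324, 0.2500, 0.1544, 0.0368]
non_khandayat = [0.0351, 0.1316, 0.2456, 0.1140, 0.2632, 0.1667, 0.0439]

# ASSUMPTION: sample sizes inferred from the smallest frequencies
# (1/136 ~ 0.0074, 1/114 ~ 0.0088); they are not given in this section.
gd_kh = gene_diversity(khandayat, 136)
gd_nkh = gene_diversity(non_khandayat, 114)

print(round(gd_kh, 4), round(gd_nkh, 4))
```

Under these assumptions the sketch gives 0.8102 and 0.8163, the values shown for DYS635 in Figure 4.17, which suggests (but does not prove) that this estimator was the one applied.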




For a more extensive analysis of genetic relatedness, the haplotypes of the
Khandayat population of Odisha were compared via AMOVA with haplotypes of
other Indian populations. Details of the Indian populations used for the
comparative analysis are given in Table 4.3. AMOVA pairwise distances based
on Rst values between the Khandayats and the other Indian populations are
given in Table 4.4. The results revealed that the Khandayat population is
not closely related to the other Indian populations.


Table 4.3: Details of studied Indian populations 


Sl. no.  Population name      Location        No. of haplotypes
1        Khandayat            Odisha          146
2        Sakaldwipi Brahmin   Jharkhand       65
3        Brahmin              Karnataka       103
4        Saraswat Brahmin     Kashmir         58
5        Mahadev Koli         Maharashtra     65
6        Balmiki              Punjab          62
7        Saraswat Brahmin     Rajasthan       60
8        Tamil                Southern India  126
9        Iyengar              Tamil Nadu      67
10       Tripuri              Tripura         65
11       Rajbanshi            West Bengal     39




Table 4.4: AMOVA pairwise distances based on Rst values between the
Khandayat population of Odisha and other Indian populations.


Population 
JhBr 
KBr 
KSBr 
MK 
PB 
RSBr 
STm 
TNI 
TrI 
WBRj 


ODKh 


JhBr 


0.2362 


0.2048 


0.1422 


*JhBr-Jharkhand Sakaldwipi Brahmin, RSBr-Rajasthan Saraswat Brahmin, KBr-Karnataka
Brahmin, STm-Southern India Tamil, TNI-Tamil Nadu Iyengar, KSBr-Kashmir Saraswat
Brahmin, MK-Maharashtra Mahadev Koli, PB-Punjab Balmiki, TrI-Tripura Tripuri, WBRj-
West Bengal Rajbanshi, ODKh-Odisha Khandayat.




A multidimensional scaling (MDS) plot based on pairwise genetic distances
(Rst values) between the Khandayat population of Odisha and the other
Indian populations was constructed (Figure 4.18).
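
How a matrix of pairwise distances is turned into two-dimensional coordinates can be sketched as follows. This is an illustration only: it uses classical (Torgerson) MDS with a simple power iteration, which is a swapped-in, simpler technique than the stress-minimising MDS implied by the reported stress value, and it is not the software used in this study. The population labels and the distance values are hypothetical, not Rst values from Table 4.4.

```python
import math


def classical_mds(dist, k=2, iters=500):
    """Classical MDS: embed a symmetric distance matrix in k dimensions."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * k for _ in range(n)]
    for axis in range(k):
        # Power iteration for the leading eigenvector of B.
        v = [float(i + 1) for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        # Rayleigh quotient gives the eigenvalue; negative values are clipped.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = v[i] * scale
        # Deflate B so the next pass finds the next eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


# HYPOTHETICAL labels and distances, chosen only to make the example checkable.
labels = ["PopA", "PopB", "PopC", "PopD"]
dist = [
    [0.0, 0.3, 0.4, 0.5],
    [0.3, 0.0, 0.5, 0.4],
    [0.4, 0.5, 0.0, 0.3],
    [0.5, 0.4, 0.3, 0.0],
]
points = classical_mds(dist)
for name, (x, y) in zip(labels, points):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

Populations separated by small distances land close together in the plot. Only the relative positions are meaningful, because the axes can be rotated or reflected without changing the distances between points.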


MDS plot (x-axis: Dimension 1; y-axis: Dimension 2; stress = 0.0366)


Figure 4.18: MDS Plot for Indian populations 




Neighbor Joining Tree (populations shown: JhBr, RSBr, KBr, STm, TNI, KSBr,
MK, PB, TrI, WBRj, ODKh)

Figure 4.19: Neighbor Joining Tree showing relationship between Khandayat
population and other Indian populations


*JhBr-Jharkhand Sakaldwipi Brahmin, RSBr-Rajasthan Saraswat Brahmin, KBr-Karnataka
Brahmin, STm-Southern India Tamil, TNI-Tamil Nadu Iyengar, KSBr-Kashmir Saraswat
Brahmin, MK-Maharashtra Mahadev Koli, PB-Punjab Balmiki, TrI-Tripura Tripuri, WBRj-
West Bengal Rajbanshi, ODKh-Odisha Khandayat.




Figure 4.20: Dendrogram showing relationship between Khandayat population
and other Indian populations


*JhBr-Jharkhand Sakaldwipi Brahmin, RSBr-Rajasthan Saraswat Brahmin, KBr-Karnataka
Brahmin, STm-Southern India Tamil, TNI-Tamil Nadu Iyengar, KSBr-Kashmir Saraswat
Brahmin, MK-Maharashtra Mahadev Koli, PB-Punjab Balmiki, TrI-Tripura Tripuri, WBRj-
West Bengal Rajbanshi, ODKh-Odisha Khandayat.




To study the genetic relatedness between the Khandayats and global
populations, an MDS plot was constructed using Rst values (Figure 4.21).


MDS plot (x-axis: Dimension 1; y-axis: Dimension 2; stress = 0.00391)
Labelled populations include: India; Beijing, China [Han]; Basque Country,
Spain; South Kazakhstan, Kazakhstan [Kazakh]; Croatia [Croatian];
Kathmandu, Nepal [Nepalese]


Figure 4.21: MDS Plot for global populations 


MDS plot (x-axis: Dimension 1; y-axis: Dimension 2; stress = 0.0129)


Figure 4.22: MDS Plot for global populations 




Population genetic studies have revealed a large extent of genetic diversity
among different populations of India (Sahoo et al., 2006; Barik et al., 2008;
Mukherjee et al., 2009; Yadav et al., 2011; Khurana et al., 2014). The present
study was carried out to examine haplotype diversity at 17 Y-STR loci in the
Khandayat population of Odisha and to compare it with other Indian
populations in order to determine the genetic relationships between them.
Although data on various Indian populations have been reported, no published
data are available on the genetic structure of the Khandayat population of
Odisha that elucidate haplotype diversity based on 17 Y-STR loci. The
majority of haplotypes obtained in this study are unique. Among the 17 Y-STR
loci analyzed, the highest gene diversity (0.9609) was observed for locus
DYS385a/b and the lowest (0.4223) for locus DYS391, which is in accordance
with an earlier finding in South Indian population data (Balamurugan et al.,
2010).


For a more extensive analysis of genetic relatedness, haplotypes of the
Khandayat population of Odisha were compared via AMOVA with the following
Indian population samples: Sakaldwipi Brahmin, Jharkhand (65 haplotypes);
Brahmin, Karnataka (103); Saraswat Brahmin, Kashmir (58); Mahadev Koli,
Maharashtra (65); Balmiki, Punjab (62); Saraswat Brahmin, Rajasthan (60);
Tamil, Southern India (126); Iyengar, Tamil Nadu (67); Tripuri, Tripura (65);
and Rajbanshi, West Bengal (39).




Comparative analysis revealed that pairwise genetic distance values ranged
from 0.0012 to 0.3863. The multidimensional scaling (MDS) plot based on Rst
values revealed that the Khandayat population of Odisha is significantly
different from the other Indian populations. The MDS plot also shows a high
level of heterogeneity among Indian populations. Among the various Indian
populations, the Khandayat population shows closer similarity to the
Rajbanshi population (West Bengal, India) and the Tripuri population
(Tripura, India). The Neighbor Joining Tree and the dendrogram reveal the
same pattern.


The haplotypes of the Khandayats were also compared with the haplotypes of
various global populations. The pairwise difference values were 0.0128 for
Australia [Aboriginal], 0.0681 for the Taiwan population, 0.1023 for
Afghanistan [Afghan], and 0.1785 for Germans. These values show that the
Khandayat population is distant from European populations and close to the
Australia [Aboriginal] and Turkish populations (Figure 4.21 & Figure 4.22).


Haplotype diversity and discrimination capacity for the studied Khandayat
population were found to be 0.999128 and 0.95588 respectively. These high
values indicate that the 17 Y-STR loci used in the current study are highly
polymorphic in the Khandayat community. Thus, this set of Y-STRs can be used
for forensic purposes such as paternity testing, individual identification
and genetic mapping, and the data will add to the databank of studies
conducted on Indian populations, as no previous Y-STR data are available in
the literature for this population.
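
The two summary statistics can be illustrated with a minimal sketch. It assumes the usual definitions, which are not restated in this section: haplotype diversity as n/(n-1) * (1 - sum of squared haplotype frequencies), and discrimination capacity as the number of distinct haplotypes divided by the number of samples. The function names and the five-sample list are hypothetical and are not data from this study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """n/(n-1) * (1 - sum(p_i^2)), with p_i the frequency of haplotype i."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def discrimination_capacity(haplotypes):
    """Distinct haplotypes divided by total samples."""
    return len(set(haplotypes)) / len(haplotypes)


# HYPOTHETICAL sample: five individuals, two of whom share haplotype "H3".
toy = ["H1", "H2", "H3", "H3", "H4"]
hd = haplotype_diversity(toy)
dc = discrimination_capacity(toy)
print(round(hd, 4), round(dc, 4))
```

In practice each haplotype would be the full tuple of allele calls across the 17 loci rather than a short label. Both statistics approach 1 as fewer individuals share a haplotype.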




Conclusion 


DNA typing for forensic identification is a two-step process. The first step
involves developing profiles from samples collected at the crime scene
and comparing them with the profiles obtained from suspects and victims.
In the case of a match that includes the suspect as a potential source
of the sample collected at the crime scene, the second step is to answer
the question: what is the likelihood that someone other than the suspect
could match the profile of the analyzed sample? This likelihood is
calculated by determining the frequency of the suspect's profile in the
relevant population databases. The issue becomes more relevant in the
case of discrete polymorphic markers that show a higher probability of
occurrence in the reference population, where a difference of several
orders of magnitude between databases may have an impact on the jury.
This necessitates the development of reference databases for different
populations for forensic purposes.
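
The second step can be sketched with the simple counting method often used for Y-STR haplotypes, in which the frequency of a profile is the number of times it occurs in the reference database divided by the database size. This is an illustrative assumption, not necessarily the procedure applied in any particular casework. The function name, the `augmented` option, which counts the case profile as one extra observation ((x + 1)/(n + 1)), and the toy database are all hypothetical.

```python
def profile_frequency(database, profile, augmented=False):
    """Estimate a haplotype's frequency by counting matches in a database."""
    n = len(database)
    if n == 0:
        raise ValueError("empty database")
    x = sum(1 for h in database if h == profile)
    if augmented:
        # Count the case profile as one more observation: (x + 1) / (n + 1).
        return (x + 1) / (n + 1)
    return x / n


# HYPOTHETICAL database of four haplotype labels.
database = ["A", "A", "B", "C"]
seen = profile_frequency(database, "A")
unseen = profile_frequency(database, "D", augmented=True)
print(seen, unseen)
```

The augmented form avoids reporting a frequency of zero for a haplotype that is absent from the database, which is why the size and the population relevance of the reference database matter.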


India is known for its vast human diversity, comprising more than four
and a half thousand anthropologically well-defined populations. Each
population differs in terms of language, culture, physical features and,
most importantly, genetic architecture. There has been tremendous interest
among historians, archaeologists, anthropologists, linguists and geneticists
in understanding the unique structure of Indian populations and their
affinities with the rest of the world. Moreover, researchers working on
various diseases often find that disease-causing genetic variations are
different in Indian populations. During the last two decades, many exciting
observations have been made regarding Indian people by several
investigators; however, these findings have remained scattered. The same
period has witnessed remarkable advancements in technology, from low
resolution genetic markers to high throughput whole genome sequencing.
Despite these advancements, studies using high density markers have been
lacking in the Indian scenario.


Therefore, an attempt was made to extensively study Odisha populations
using 17 Y-STR markers. The high haplotype diversity and discrimination
capacity indicate that the 17 Y-STR loci used in the current study are
highly polymorphic in the Khandayat community. Thus, this set of Y-STRs
can be used for forensic purposes such as paternity testing, individual
identification and genetic mapping, and the data will add to the databank
of studies conducted on Indian populations, as no previous Y-STR data are
available in the literature for this population. Comparative analysis of
the Khandayat population with other Indian populations revealed that this
population is highly endogamous and that there is little genetic influence
from other populations.


This study provides a substantial body of information on the Khandayat
population of Odisha, and the data presented here will aid future
comparisons in population genetics research based on Y-STR markers. Future
research can be carried out using newly available Y-STR markers in order
to study the diversity among the Khandayats.




Appendices 


Supplementary Table 1: Haplotypes of Khandayat Population 


Sample DYS19 DYS389I DYS389II DYS390 DYS391 DYS392 DYS393 DYS385a/b DYS438 DYS439 DYS437 DYS448 DYS456 DYS458 DYS635 YGATAH4 N
Kh1 15 13 27 22 10 12 13 13,19 11 10 16 19 16 16 24 12 1
Kh2 13 13 27 22 10 14 11 13,17 10 14 15 18 17 15 22 12 1
Kh3 13 12 28 24 10 13 12 14,16 10 11 14 18 16 18 22 10 1
Kh4 15 13 29 21 11 11 13 11,17 11 10 14 20 15 16 23 13 1
Kh5 13 13 27 22 12 14 12 11,16 10 11 16 19 16 16 23 11 1
Kh6 13 13 28 22 10 14 11 13,19 10 12 15 19 16 16 22 12 1
Kh7 15 13 29 21 10 12 13 11,19 11 10 14 20 15 15 23 12 1
Kh8 13 14 30 23 10 12 12 12,17 10 10 14 19 15 18 21 11 1
Kh9 14 13 30 23 10 13 12 12,17 10 10 14 19 15 19 21 11 1
Kh10 15 12 29 21 10 12 12 15,17 9 11 14 19 15 17 21 12 
Kh11 2 3 27 22 0 2 2 14,14 9 11 14 19 15 16 20 12
Kh12 2 2 27 24 0 4 2 13,17 9 12 15 20 16 17 22 11 
Kh13 3 2 30024 1 2 3 12,18 11 10 14 20 16 16 23 13 
Kh14 3 4 30 22 0 1 2 15,17 9 11 14 19 17 16 21 12
Kh15 2 3 29 «224 0 2 2 12,20 10 12 14 19 15 18 21 12 
Khl6 3 2 2] 22 0 4 1 13,20 10 12 15 19 15 16 21 12 
Kh17 3 3 30 24 0 2 3 15,21 10 12 14 19 15 16 20 11
Kh18 3 3 27) 22 0 2 3 12,20 10 11 15 19 16 17 21 11 
Kh19 3 3 27 22 0 1 2 15,19 9 11 14 19 15 17 21 11
Kh20 4 3 29 23 0 2 3 13,18 1 13 16 18 17 18 24 10 
Kh21 4 3 27 22 0 1 3 12,17 1 11 15 19 15 17 23 10 
Kh22 3 2 28 23 0 2 3 13,19 1 11 16 20 15 18 24 12 
Kh23 3 3 29 «21 1 2 3 11,17 1 10 14 20 16 16 23 12 
Kh24 3 2 28 22 0 4 1 13,21 10 12 15 19 15 14 24 12 
Kh25 3 3 27 22 0 2 2 15,17 9 12 14 19 16 17 22 12 
Kh26 3 3 27 24 0 4 3 13,18 1 11 16 20 15 16 21 11 
Kh27 4 4 29. 22 0 2 2 15,18 9 12 14 19 16 18 21 12 
Kh28 3 30 «22 0 0 2 13,18 1 14 16 19 15 18 25 12 
Kh29 4 3 30.021 0 2 3 10,19 1 10 14 20 15 17 23 12 
Kh30 4 4 300-23 1 3 2 16,17 9 1 14 19 15 16 20 12 
Kh31 5 3 27; 22 0 2 2 15,21 9 12 14 20 16 18 19 12 
Kh32 6 3 30 «24 0 2 3 13,20 1 10 14 20 15 16 23 13 
Kh33 2 4 29° «23 0 3 4 12,18 J 1 17 19 17 18 25 12 
Kh34 3 4 30.)—s 21 0 2 3 11,20 1 10 14 20 15 16 23 13 
Kh35 4 4 30.021 0 2 3 11,19 1 10 14 20 15 17 21 13 
Kh36 5 3 30 «22 0 2 2 14,19 1 1 15 19 15 19 21 11 
Kh37 4 3 27 23 0 2 4 13,20 9 L 14 21 14 17 24 11 




12 
13 


19 14 18 21 
15 15 


20 


9 i 14 
14 


11 


30.24 +10 «+12 «213 ~«13,21 
30. 21 11,19 
13,20 
13,17 
15,21 


12: 


14 


Kh38 


24 


10 


14 
1 


10 


12 
0 
14 


10 


14 


13 
3 
12 


Kh39 


10 
10 
9 


15 19 15 14 22 12 
14 


13 
11 


12 


16 


20 


14 


12 
13 


24 
20 


12 


21 


12 


12 


21 


12 


21 


12 


21 


14 


16 


19 


15 


10 


23 


12 
12 


25 


22 


12 


10 


10 


12 


24 


13 


23 


3 


3 


14,19 


10,20 


2 


13,21 


1 


3 


13 


4 


4 


4 


10 


24 
22 
21 


28 


12: 


10 


27 
29 


13 


24 


30 
30 


22 


27 


21 


27 


30 
27 


22 


22 


28 


21 


30 
30 
28 


22 


23 


27 


23 


30 


22 


27 


21 


29 


21 


30 


21 


30 
28 


22 


22 


27 


24 


28 


22 


27 


21 


27 


12 


3 
2 


2 
2 


6 


4 


5 


4 


4 


3 


4 
4 


3 


6 


6 


2 


) 


5 


3 


5 


Kh40
Kh41
Kh42
Kh43
Kh44
Kh45
Kh46
Kh47
Kh48
Kh49
Kh50
Kh51
Kh52
Kh53
Kh54
Kh55
Kh56
Kh57
Kh58
Kh59
Kh60
Kh61
Kh62
Kh63
Kh64
Kh65
Kh66
Kh67
Kh68
Kh69
Kh70
Kh71
Kh72
Kh73
Kh74
Kh75
Kh76
Kh77
Kh78
Kh79
Kh80
Kh81




30 
27 
27 
30 
27 
27 
28 
27 
30 
30 
28 
30 
30 
28 
27 
at 
28 
30 
29 
28 
30 
28 
27 
27 
29 
30 
30 
27 
29 
27 
28 
29 
28 
27 
27 
29 
30 
30 
28 
30 
28 
30 
30 




28 
28 
29 
27 
28 
29 
30 
30 
29 
27 
27 
30 
30 
30 
28 
27 
ay 
29 
30 
30 
27 
30 


21 
24 
2 
22 
22 
21 
23 
23 
21 
22 
24 
24 
22 
24 
22 
23 
22 
23 
22 
22 
2 
2 




Supplementary Table 2: Haplotypes of Non-Khandayat Population 




Sample DYS19 DYS389I DYS389II DYS390 DYS391 DYS392 DYS393 DYS385a/b DYS438 DYS439 DYS437 DYS448 DYS456 DYS458 DYS635 YGATAH4 N
NKh1 14 13 27 21 11 12 12 15,18 8 11 14 19 15 16 19 11 1
NKh2 15 12 28 22 10 14 11 13,18 10 13 15 19 15 16 23 12 1
NKh3 15 12 27 23 10 12 12 13,20 9 11 15 19 16 17 ppp 10 1
NKh4 13 14 30 24 10 12 13 11,20 11 10 14 20 14 16 23 13 1
NKh5 15 13 30 22 11 12 12 14,19 9 11 14 19 16 16 21 12 1
NKh6 15 12 28 22 11 14 11 13,17 10 13 15 19 15 15 24 12 1
NKh7 13 12 30 21 10 12 12 14,17 9 13 15 19 14 17 20 11 1
NKh8 15 13 30 24 11 12 13 11,18 11 10 14 20 14 15 23 12 1
NKh9 13 12 28 22 11 14 11 14,17 10 13 15 19 17 15 21 13 1
NKh10 12 13 27 22 10 12 13 14,18 10 12 15 19 15 16 22 11 1
NKh11 12 13 27 22 11 14 11 13,18 10 10 15 19 16 14 24 12 1
NKh12 12 12 28 22 9 14 11 14,17 11 12 15 19 16 15 21 12 1
NKh13 12 14 30 23 10 12 13 14,17 9 11 14 20 15 17 24 11 1
NKh14 13 13 29 21 11 12 12 11,18 11 10 14 20 15 17 23 12 1
NKh15 13 12 28 23 10 12 12 14,18 10 14 15 17 16 16 21 12 1
NKh16 14 12 30 24 10 14 13 13,16 11 11 15 19 15 16 22 11 1
NKh17 14 14 29 22 10 12 12 15,18 9 12 14 19 16 18 21 12 1
NKh18 13 13 30 22 10 10 12 13,18 11 14 16 19 15 18 25 12 1
NKh19 14 13 30 21 10 12 13 10,19 11 10 14 20 15 17 23 12 1
NKh20 13 12 28 23 10 12 12 14,18 10 14 15 17 16 16 21 12 1
NKh21 14 12 30 24 10 14 13 13,16 11 11 15 19 15 16 22 11 1
NKh22 15 12 28 22 11 14 11 13,17 10 12 15 19 15 15 23 12 1
NKh23 14 12 30 24 10 12 13 13,21 9 11 14 19 14 18 21 12 1
NKh24 13 14 30 21 10 12 14 11,19 11 10 14 20 15 15 24 13 1
NKh25 13 12 28 21 10 10 11 13,20 10 12 15 19 16 15 23 12 1
NKh26 12 12 28 24 10 14 10 13,17 10 13 15 19 15 14 22 12 1
NKh27 15 13 29 21 11 11 13 LAL 11 10 14 20 15 16 23 13 1
NKh28 13 13 27 22 12 14 12 11,16 10 11 16 19 16 16 23 11 1
NKh29 13 13 28 22 10 14 11 13,19 10 12 15 19 16 16 22 12 1
NKh30 15 13 29 21 10 12 13 11,19 11 10 14 20 15 15 23 12 1
NKh31 13 14 30 23 10 12 12 12,17 10 10 14 19 15 18 21 11 2
NKh32 14 13 30 23 10 13 12 12,17 10 10 14 19 15 19 21 11 1
NKh33 15 13 27 23 10 12 12 14,19 9 11 14 19 16 18 20 12 1
NKh34 14 14 27 22 10 11 13 12,18 11 10 16 20 14 17 21 11 1
NKh35 15 13 29 23 10 12 12 14,18 10 11 14 19 15 17 19 13 1
NKh36 15 14 30 22 10 14 11 13,18 10 12 15 19 15 15 24 12 1
NKh37 14 14 30 22 10 12 12 15,19 9 11 14 20 15 16 20 13 1




Sample  DYS389II  DYS390  DYS385a/b
NKh38   27        21      15,20
NKh39   29        23      13,18
NKh40   27        22      12,17
NKh41   28        23      13,19
NKh42   29        21      11,17
NKh43   27        22      11,16
NKh44   28        22      13,19
NKh45   29        21      11,19
NKh46   30        23      12,17
NKh47   30        23      12,17
NKh48   29        21      15,17
NKh49   2]        22      14,14
NKh50   27        24      13,17
NKh51   30        24      12,18
NKh52   30        22      15,17
NKh53   30        24      13,16
NKh54   28        22      13,17
NKh55   27        23      14,19
NKh56   27        22      12,18
NKh57   29        23      14,18
NKh58   30        22      13,18
NKh59   30        22      15,19
NKh60   27        21      15,20
NKh61   30        21      12,17
NKh62   28        24      14,16
NKh63   29        21      11,17
NKh64   27        22      11,16
NKh65   28        22      13,19
NKh66   29        21      11,19
NKh67   30        23      12,17
NKh68   30        23      12,17
NKh69   29        21      15,17
NKh70   27        22      14,14
NKh71   27        24      13,17
NKh72   30        24      12,18
NKh73   30        22      15,17
NKh74   27        21      14,19
NKh75   30        22      10,20
NKh76   27        22      16,17
NKh77   28        22      13,19
NKh78   30        22      15,17
NKh79   30        21      12,19
NKh80   30        22      13,18
NKh81   28        23      12,17




Sample  DYS389II  DYS390  DYS385a/b
NKh82   27        22      13,21
NKh83   30        23      13,17
NKh84   30        24      12,18
NKh85   27        22      14,20
NKh86   27        22      14,17
NKh87   30        22      15,17
NKh88   29        21      11,19
NKh89   30        24      13,21
NKh90   30        21      11,19
NKh91   28        21      13,20
NKh92   28        24      13,17
NKh93   at        22      15,21
NKh94   29        21      18,19
NKh95   29        21      18,19
NKh96   27        23      12,20
NKh97   30        24      14,17
NKh98   30        23      13,20
NKh99   28        22      13,19
NKh100  27        24      14,19
NKh101  27        22      15,21
NKh102  30        21      12,17
NKh103  27        21      15,18
NKh104  27        21      15,18
NKh105  28        22      13,18
NKh106  27        23      13,20
NKh107  30        24      11,20
NKh108  30        22      14,19
NKh109  28        22      13,17
NKh110  30        21      14,17
NKh111  30        24      11,18
NKh112  28        22      14,17
NKh113  2h        22      14,18
NKh114  27        22      13,18
NKh115  28        22      14,17
NKh116  30        23      14,17
NKh117  29        21      11,18
NKh118  28        23      14,18
NKh119  27        21      14,19
NKh120  30        22      10,20
NKh121  27        22      16,17
NKh122  28        22      13,19
NKh123  30        22      15,17
NKh124  30        21      12,19
NKh125  30        22      13,18




NKh126 139012 28 23 10 12 13 © 12,17 9 11 61506 (21) 16 15 1 
NKh127 130 11 6 27)0«-22)—C—10- ss 14) 11s 13,21 100121519 16 140 21 1 
NKh128 13 14 30 23 9 10 13 13,17 11 12 16 19 16 17 24 12 1
NKh129 14 14 30 24 10 14 «+112 «12,18 $410 11 14 #18 15 «170 « 23)~6«(10 1 
NKh130 14 13 27 22 10 12 «13~©«©14,20 10 «14 «150619 61606:1700~«622~«O 1 
NKh131 13. 12 27 22 #10 «14 ~«11)0«614,17 10 13) 14 = «19 «1506 «1500 (2512 1 
NKh132 BB (3 30 22 “10° 12 42. 15,17 9 11 14 #19 17 16 22 «12 1 
NKh133 iF “13> 29 21 9 12 13 11,19 11 10 14 20 16 15 23 13 1 
NKh134 13, 13 27) 24 «©1006«614 0) 13s«13,18 =61t 11 16 20 1516 2s 1 
NKh135 144 14 29 22 10 12 12 =~ 15,18 9 12 14 19 16 18 21 12 1 
NKh136 13. 13 30 22 «100 «6©6100612)0«613,18 Il 14 16 19 15 18 25 12 1 
NKh137 14 13 30 21 #10 12 «13©«©10,19 1 10 14 20 15 17) 23=« 12 1 
NKh138 14 14 30 23 11 13 12 16,17 9 11 14 19 15 16 20 12 1
NKh139 1S 13) 27) 22 10 12 12 lt 9 12 14 20 16 18 #19 += 12 1 
NKh140 16 13 30 24 10 12 «13061320 11 10 14 20 0 15) «61600 «623613 1 


NKh141 12 14 29 23 10 13 14 12,18 11 11 17 19 17 18 25 12 1
NKh142 13 14 30 23 9 10 13 13,17 11 12 16 19 16 17 24 12 1
NKh143 14 14 30 24 10 14 «+12 = «12,18 $410 11 14 #18 15 6170 «©.23)6«(10 1 




Bibliography 


Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2002). 
Molecular biology of the cell, 4th edition. Garland Press, NY.

Allen RC, Graves G, Budowle B (1989). Polymerase chain reaction 
amplification products separated on rehydratable polyacrylamide gels and 
stained with silver. Biotechniques 7(7):736-744. 

Armour JA, Wong Z, Wilson V, Royle NJ, Jeffreys AJ (1989). Sequences 
flanking the repeat arrays of human minisatellites: association with 
tandem and dispersed repeat elements. Nucleic Acids Res 17(13):4925- 
4935. 

Avery OT, MacLeod CM, McCarty M (1944). Studies on the chemical 
nature of the substance inducing transformation of Pneumococcal types. J 
Exp Med 79:137-159. 

Ayub Q, Mohyuddin A, Qamar R, Mazhar K, Zerjal T, Mehdi SQ, Tyler- 
Smith C (2000). Identification and characterisation of novel human Y- 
chromosomal microsatellites from sequence database information. Nucleic 
Acids Res 28(2):e8. 

Bailey JA, Gu ZP, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams 
DM, Myers EW, Li PW, Eichler EE (2002). Recent segmental 
duplications in the human genome. Science 297:1003-1007. 

Bairagya BB, Bhattacharya P, Bhattacharya SK, Dey B, Dey U, Ghosh T, 
Maiti S, Majumder PP, Mishra K, Mukherjee S, Mukherjee S, 
Narayanasamy K, Poddar S, Roy NS, Sengupta P, Sharma S, Sur D, 
Sutradhar D, Wagener DK (2008). Genetic variation and haplotype 
structures of innate immunity genes in eastern India. Infect Genet Evol 
8(3):360-6. 

Balamurugan K, Suhasini G, Vijaya M, Kanthimathi S, Mullins N, Tracey
M, Duncan G (2010). Y chromosome STR allelic and haplotype diversity
in five ethnic Tamil populations from Tamil Nadu, India. Leg Med
(Tokyo) 12(5):265-9.

Ballantyne KN, Goedbloed M, Fang R, Schaap O, Lao O, Wollstein A, 
Choi Y, van Duijn K, Vermeulen M, Brauer S, Decorte R, Poetsch M, von 
Wurmb-Schwark N, de Knijff P, Labuda D, Vézina H, Knoblauch H, 
Lessig R, Roewer L, Ploski R, Dobosz T, Henke L, Henke J, Furtado MR, 
Kayser M (2010). Mutability of Y-chromosomal microsatellites: rates, 
characteristics, molecular bases, and forensic implications. Am J Hum
Genet 87(3):341-53. 

Bamshad M, Kivisild T, Scott Watkins W, Dixon ME, Ricker CE, Rao 
BB, Mastan Naidu J, Ravi Prasad BV, Govinda Reddy P, Rasanayagam A, 
Papiha SS, Villems R, Redd AJ, Hammer MF, Nguyen SV, Carroll ML, 
Batzer MA, Jorde LB (2001). Genetic evidence on the origins of Indian 
caste populations. Genome Res 11:994-1004. 

Bamshad M, Wooding SP (2003). Signatures of natural selection in the 
human genome. Nat Rev Genet 4:99-111. 

Bar W, Brinkmann B, Budowle B, Carracedo A, Gill P, Lincoln P, Mayr 
W, Olaisen B (1997). DNA recommendations. Further report of the DNA 
Commission of the ISFG regarding the use of short tandem repeat 
systems. Forensic Sci Int 87(3):179-184. 

Barik SS, Sahani R, Prasad BV, Endicott P, Metspalu M, Sarkar BN,
Bhattacharya S, Annapoorna PC, Sreenath J, Sun D, et al. (2008). Detailed
mtDNA genotypes permit a reassessment of the settlement and population
structure of the Andaman Islands. Am J Phys Anthropol 136:19-27.

Barnabas S, Apte RV, Suresh CG (1996). Ancestry and interrelationships
of the Indians and their relationship with other world populations a study
based on mitochondrial DNA polymorphisms. Ann Hum Genet 60:409-422.

Barreiro LB, Laval G, Quach H, Patin E, Quintana-Murci L (2008). 
Natural selection has driven population differentiation in modern humans. 
Nat Genet 40(3):340-345. 

Basu A, Mukherjee N, Roy S, Sengupta S, Banerjee S, Chakraborty M, 
Dey B, Roy M, Roy B, Bhattacharyya NP, Roychoudhury S, Majumder 
PP (2003). Ethnic India: a genomic view, with special reference to 
peopling and structure. Genome Res 13(10):2277-90. 

Beleza S, Alves C, Gonzales-Neira A, Lareu M, Amorim A, Carracedo A, 
Gusmao L (2003). Extending STR markers in Y chromosome haplotypes.
Int J Legal Med 117(1):27-33. 

Bell GI, Selby MJ, Rutter WJ (1982). The highly polymorphic region near 
the human insulin gene is composed of simple tandemly repeating 
sequences. Nature 295(5844):31-35. 

Bhasin MK, Walter H (2001). Genetics of Castes and Tribes of India. 
Kamla-Raj Enterprises, Delhi. 

Bhattacharyya NP, Basu P, Das M, Pramanik S, Banerjee R, Roy B, 
Roychoudhury S, Majumder PP (1999). Negligible Male Gene Flow 
Across Ethnic Boundaries in India, Revealed by Analysis of Y- 
Chromosomal DNA Polymorphisms. Genome Res 9:711-719. 

Bieber FR, Brenner CH, Lazer D (2006). Finding criminals through DNA 
of their relatives. Science 312(5778):1315-1316. 




Biemont C, Vieira C (2006). Junk DNA as an evolutionary force. Nature 
443:521-524. 

Boerwinkle E, Xiong WJ, Fourest E, Chan L (1989). Rapid typing of 
tandemly repeated hypervariable loci by the polymerase chain reaction: 
application to the apolipoprotein B 3' hypervariable region. Proc Nat Acad 
Sci USA 86(1):212-216. 

Bosch E, Lee AC, Calafell F, Arroyo E, Henneman P, de Knijff P, Jobling 
MA (2002). High resolution Y-chromosome typing: 19 STRs amplified in 
three multiplex reactions. Forensic Sci Int 125(1):42-51. 

Bower B (2000). ‘Y guy’ steps into human-evolution debate. Science 
News 158:295. 

Bower B (2003). Y trail of the first Americans: DNA data point to late 
New World entry. Science News 164:212. 

Britten RJ, Kohne DE (1968). Repeated Sequences in DNA. Science 
161(3841):529-540. 

Brown K (2002). Tangled roots? Genetics meets genealogy. Science 
295:1634-1635. 

Brown TA (2002). Genomes, 2nd edition. Garland Science Publisher, NY.

Budowle B, Chakraborty R, Giusti AM, Eisenberg AJ, Allen RC (1991).
Analysis of the VNTR locus D1S80 by the PCR followed by high-
resolution PAGE. Am J Hum Genet 48(1):137-144.

Budowle B, Ge J, Low J, Lai C, Yee WH, Law G, Tan WF, Chang YM, 
Perumal R, Keat PY, Mizuno N, Kasai K, Sekiguchi K, Chakraborty R 
(2009). The effects of Asian population substructure on Y STR forensic 
analyses. Leg Med (Tokyo) 11(2):64-9. 

Butler JM (2001). Forensic DNA Typing: Biology and Technology behind 
STR Markers. Elsevier Academic Press, MA. 




Butler JM (2003). Recent developments in Y-short tandem repeat and Y-
single nucleotide polymorphism analysis. Forensic Sci Rev 15:91-111.

Butler JM (2005). Forensic DNA Typing: Biology, Technology, and
Genetics of STR Markers, 2nd edition. Elsevier Academic Press, MA.

Butler JM (2006). Genetics and genomics of core short tandem repeat loci
used in human identity testing. J Forensic Sci 51:253-265.

Butler JM (2007). Short tandem repeat typing technologies used in human 
identity testing. Biotechniques 43(4):Sii-Sv. 

Butler JM (2010). Fundamentals of Forensic DNA Typing. Elsevier 
Academic Press, MA. 

Butler JM (2012). Advanced Topics in Forensic DNA Typing:
Methodology. Elsevier Academic Press, MA. 

Butler JM, Devaney JM, Marino MA, Vallone PM (2001). Quality control 
of PCR primers used in multiplex STR amplification reactions. Forensic 
Sci Int 119(1):87-96. 

Butler JM, Kline MC, Decker AE (2008). Addressing Y-chromosome 
short tandem repeat (Y-STR) allele nomenclature. J Genet Geneal 
4(2):125-148. 

Butler JM, Ruitberg CM, Vallone PM (2001). Capillary electrophoresis as 
a tool for optimization of multiplex PCR reactions. Fresenius J Anal Chem 
369(3-4):200-205. 

Butler JM, Schoske R, Vallone PM, Kline MC, Redd AJ, Hammer MF 
(2002). A novel multiplex for simultaneous amplification of 20 Y- 
chromosome STR markers. Forensic Sci Int 129(1):10-24. 

Callaway E (2012). Archaeology: Date with history. Nature 485:27-29.




Carvalho-Silva DR, Santos FR, Hutz MH, Salzano FM, Pena SDJ (1999). 
Divergent human Y-chromosome microsatellite evolution rates. J Mol 
Evol 49:204-214. 

Cavalli-Sforza LL (2005). The Human Genome Diversity Project: past, 
present and future. Nat Rev Genet 6:333-340. 

Cavalli-Sforza LL, Menozzi P, Piazza A (1996). The History and 
Geography of Human Genes. Princeton University Press, NJ. 

Cerri N, Ricci U, Sani I, Verzeletti A, Ferrari FD (2003). Mixed stains 
from sexual assault cases: autosomal or Y-chromosome short tandem 
repeats? Croat Med J 44(3):289-292. 

Chakraborty R (1985). Paternity testing with genetic markers: are Y- 
linked genes more efficient than autosomal ones? Am J Med Genet 
21:298-305. 

Chamyal LS, Maurya DM, Raj R, Juyal N, Bhandari S, Pant RK, Gaillard
C (2011). Discovery of a Robust Fossil Homo sapiens in India (Orsang
River Valley, Lower Narmada Basin, Gujarat). Possible Continuity with
Asian Homo erectus. Acta Anthropologica Sinica 2:158-191.

Chandrasekar A, Kumar S, Sreenath J, Sarkar BN, Urade BP, Mallick S,
Bandopadhyay SS, Barua P, Barik SS, Basu D, et al. (2009). Updating
phylogeny of mitochondrial DNA macrohaplogroup m in India dispersal
of modern human in South Asian corridor. PLoS One 4:e7447.

Chargaff E (1950). Chemical specificity of nucleic acids and mechanism 
of their enzymatic degradation. Experientia 6(6):201-209. 

Chargaff E (1971). Preface to a grammar of biology. A hundred years of 
nucleic acid research. Science 172(3984):637-642. 

Charlesworth B, Charlesworth D (2000). The degeneration of Y 
chromosomes. Philos Trans R Soc Lond B Biol Sci 355:1563-1572. 




Chaubey G, Karmin M, Metspalu E, Metspalu M, Selvi-Rani D, Singh
VK, Parik J, Solnik A, Naidu BP, Kumar A, et al. (2008). Phylogeography
of mtDNA haplogroup R7 in the Indian peninsula. BMC Evol Biol 8:227.

Chaubey G, Metspalu M, Choi Y, Magi R, Romero IG, Soares P, van
Oven M, Behar DM, Rootsi S, Hudjashov G, et al. (2011). Population
genetic structure in Indian Austroasiatic speakers: the role of landscape
barriers and sex-specific admixture. Mol Biol Evol 28:1013-1024.

Chennakrishnaiah S, Perez D, Gayden T, Rivera L, Regueiro M, Herrera
RJ (2013). Indigenous and foreign Y-chromosomes characterize the
Lingayat and Vokkaliga populations of Southwest India. Gene
526(2):96-106.

Coble MD, Loreille OM, Wadhams MJ, Edson SM, Maynard K, Meyer 
CE, Niederstatter H, Berger C, Berger B, Falsetti AB, Gill P, Parson W, 
Finelli LN (2009). Mystery solved: the identification of the two missing 
Romanov children using DNA analysis. PLoS One 4(3):e4838.

Collins FS, Brooks LD, Chakravarti A (1998). A DNA Polymorphism 
Discovery Resource for Research on Human Genetic Variation. Genome 
Res 8:1229-1231. 

Comey CT, Budowle B (1991). Validation studies on the analysis of the 
HLA DQ alpha locus using the polymerase chain reaction. J Forensic Sci 
36(6):1633-1648. 

Cooper DN (2006). Human Gene Evolution. Academic Press, CA. 

Corach D, Filgueira RL, Marino M, Penacino G, Sala A (2001). Routine 
Y-STR typing in forensic casework. Forensic Sci Int 118(2-3):131-5.

Dahm R (2008). Discovering DNA: Friedrich Miescher and the early
years of nucleic acid research. Hum Genet 122(6):565-581. 




D'Aquila RT, Bechtel LJ, Videler JA, Eron JJ, Gorczyca P, Kaplan JC
(1991). Maximizing sensitivity and specificity of PCR by pre- 
amplification heating. Nucleic Acids Res 19(13):3749. 

Das B, Chauhan PS, Seshadri M (2004). Minimal sharing of Y- 
chromosome STR haplotypes among five endogamous population groups 
from western and southwestern India. Hum Biol 76(5):743-63. 

de Knijff P (2000). Messages through bottlenecks: on the combined use of 
slow and fast evolving polymorphic markers on the human Y 
chromosome. Am J Hum Genet 67:1055-1061. 

Devlin TM (2010). Textbook of Biochemistry with Clinical Correlations,
7th edition. John Wiley & Sons, NY.

Dib C, Faure S, Fizames C, Samson D, Drouot N, Vignal A, Millasseau P, 
Marc S, Hazan J, Seboun E, Lathrop M, Gyapay G, Morissette J, 
Weissenbach J (1996). A comprehensive genetic map of the human 
genome based on 5,264 microsatellites. Nature 380(6570):152-4. 

DNA recommendations-1994 report concerning further recommendations 
of the DNA Commission of the ISFH regarding PCR-based 
polymorphisms in STR (short tandem repeat) systems. Int J Legal Med 
107(3):159-160. 

Eaaswarkhanth M, Dubey B, Meganathan PR, Ravesh Z, Khan FA, Singh 
L, Thangaraj K, Haque I (2009). Diverse genetic origin of Indian Muslims:
evidence from autosomal STR loci. J Hum Genet 54:340-348.

Edwards A, Civitello A, Hammond HA, Caskey CT (1991). DNA typing 
and genetic mapping with trimeric and tetrameric tandem repeats. Am J 
Hum Genet 49(4):746-756. 

Elhaik E, Tatarinova TV, Klyosov AA, Graur D (2014). The ‘extremely
ancient’ chromosome that isn’t: a forensic bioinformatic investigation of
Albert Perry’s X-degenerate portion of the Y chromosome. Eur J Hum
Genet 22:1111-1116.

Ellegren H (2004). Microsatellites: simple sequences with complex 
evolution. Nat Rev Genet 5:435-445. 

Ford CE, Hamerton JL (1956). The chromosomes of 
man. Nature 178:1020-1023. 

Foster EA et al. (1998). Jefferson fathered slave’s last child. Nature 
396:27-28. 

Francalacci P, Morelli L, Angius A, Berutti R, Reinier F, Atzeni R, Pilu R, 
Busonero F, Maschio A, Zara I, Sanna D, Useli A, Urru MF, Marcelli M, 
Cusano R, Oppo M, Zoledziewska M, Pitzalis M, Deidda F, Porcu E, 
Poddie F, Kang HM, Lyons R, Tarrier B, Gresham JB, Li B, Tofanelli S, 
Alonso S, Dei M, Lai S, Mulas A, Whalen MB, Uzzau S, Jones C, 
Schlessinger D, Abecasis GR, Sanna S, Sidore C, Cucca F (2013). Low- 
pass DNA sequencing of 1200 Sardinians reconstructs European Y- 
chromosome phylogeny. Science 341:565-569. 

Frank WE, Ralph HC, Tahir MA (2008). Y chromosome STR haplotypes 
and allele frequencies in a southern Indian male population. J Forensic Sci 
53:248-251. 

Franklin RE, Gosling RG (1953). Molecular configuration in sodium 
thymonucleate. Nature 171:740-741. 

Gann A, Witkowski J (2012). The Annotated and Illustrated Double Helix. 
Simon & Schuster Inc., NY. 

Gardner EJ, Simmons MJ, Snustad DP (2002). Principles of Genetics, 8th
edition. John Wiley & Sons, NY. 

Gartler SM (2006). The chromosome number in humans: A brief 
history. Nat Rev Genet 7:655-660. 




Gayden T, Chennakrishnaiah S, La Salvia J, Jimenez S, Regueiro M, 
Maloney T, Persad PJ, Bukhari A, Perez A, Stojkovic O, Herrera RJ 
(2011). Y-STR diversity in the Himalayas. Int J Legal Med 125(3):367-75.

Ge J, Budowle B, Planz JV, Eisenberg AJ, Ballantyne J, Chakraborty R
(2010). US forensic Y-chromosome short tandem repeats database. Leg
Med (Tokyo) 12(6):289-95.

Ghosh T, Kalpana D, Mukerjee S, Mukherjee M, Sharma AK, Nath S,
Rathod VR, Thakar MK, Jha GN (2011). Genetic diversity of 17 Y-short
tandem repeats in Indian population. Forensic Sci Int Genet 5(4):363-7.

Gill P (2002). Role of short tandem repeat DNA in forensic casework in
the UK: past, present, and future perspectives. Biotechniques 32:366-372.

Gill P, Brenner C, Brinkmann B, Budowle B, Carracedo A, Jobling MA,
de Knijff P, Kayser M, Krawczak M, Mayr WR, Morling N, Olaisen B, Pascali
V, Prinz M, Roewer L, Schneider PM, Sajantila A, Tyler-Smith C (2001).
DNA Commission of the International Society of Forensic Genetics: 
Recommendations on forensic analysis using Y-chromosome STRs. 
Forensic Sci Int 124:5-10. 

Gill P, Jeffreys AJ, Werrett DJ (1985). Forensic application of DNA 
‘fingerprints’. Nature 318(6046):577-579. 

Giroti R, Talwar I (2010). The most ancient democracy in the world is a 
genetic isolate: an autosomal and Y-chromosome study of the hermit 
village of Malana (Himachal Pradesh, India). Hum Biol 82(2):123-41.

Gomolka M, Hundrieser J, Nürnberg P, Roewer L, Epplen JT, Epplen C
(1994). Selected di- and tetranucleotide microsatellites from chromosomes
7, 12, 14, and Y in various Eurasian populations. Hum Genet 93:592-596.

Gonzalez-Neira A, Elmoznino M, Lareu MV, Sanchez-Diz P, Gusmao L,
Prinz M, Carracedo A (2001). Sequence structure of 12 novel Y
chromosome microsatellites and PCR amplification strategies. Forensic
Sci Int 122(1):19-26.

Goodbourn SE, Higgs DR, Clegg JB, Weatherall DJ (1983). Molecular 
basis of length polymorphism in the human zeta-globin gene 
complex. Proc Natl Acad Sci USA 80(16):5022-5026. 

Graves JA (1995). The origin and function of the mammalian Y 
chromosome and Y-borne genes - an evolving understanding. Bioessays 
17:311-320. 

Graves JAM, Wakefield MJ, Toder R (1998). Evolution of the 
pseudoautosomal region of mammalian sex chromosomes. Hum Mol 
Genet 7:1991-1996. 

Griffith F (1928). The significance of pneumococcal types. J Hyg 
(London) 27(2):113-159. 

Griffiths AJF, Wessler SR, Carroll SB, Doebley J (2012). Introduction to 
Genetics, 10th edition. W. H. Freeman & Co., NY.

Griffiths RAL, Barber MD, Johnson PE, Gillbard SM, Haywood MD, 
Smith CD, Arnold J, Burke T, Urquhart AJ, Gill P (1998). New reference 
allelic ladders to improve allelic designation in a multiplex STR system. 
Int J Legal Med 111(5):267-272. 

Grignani P, Peloso G, Fattorini P, Previderé C (2000). Highly informative 
Y-chromosomal haplotypes by the addition of three new STRs DYS437, 
DYS438 and DYS439. Int J Legal Med 114(1-2):125-129. 

Gusmao L, Brion M, Gonzalez-Neira A, Sanchez-Diz P, Lareu MV, 
Carracedo A (1999). Y chromosome specific polymorphisms in forensic 
analysis. Legal Med 1:55-60. 

Gusmao L, Butler JM, Carracedo A, Gill P, Kayser M, Mayr WR, Morling
N, Prinz M, Roewer L, Tyler-Smith C, Schneider PM (2006). DNA
Commission of the International Society of Forensic Genetics. DNA
Commission of the International Society of Forensic Genetics (ISFG): an
update of the recommendations on the use of Y-STRs in forensic analysis.
Forensic Sci Int 157:187-97.

Hall A, Ballantyne J (2003). Strategies for the design and assessment of 
Y-short tandem repeat multiplexes for forensic use. Forensic Sci Rev 
15:137-149. 

Hamilton MB (2009). Population Genetics. John Wiley & Sons, NJ.

Hammer MF (1995). A recent common ancestry for human Y
chromosomes. Nature 378:376-378. 

Hammond HA, Jin L, Zhong Y, Caskey CT, Chakraborty R (1994). 
Evaluation of 13 short tandem repeat loci for use in personal identification 
applications. Am J Hum Genet 55(1):175-189. 

Hanson EK, Ballantyne J (2007). An Ultra-High Discrimination Y 
Chromosome Short Tandem Repeat Multiplex DNA Typing System. PLoS 
One 2(8):e688.

Hartl DL, Clark AG (2006). Principles of Population Genetics. Sinauer 
Associates Inc. Sunderland, MA. 

Hedrick PW (2005). Genetics of Populations, 3rd edition. Jones & Bartlett
Publishers, Sudbury, MA. 

Hershey AD, Chase M (1952). Independent functions of viral protein and 
nucleic acid in growth of bacteriophage. J Gen Physiol 36:39-56. 

Higuchi R, von Beroldingen CH, Sensabaugh GF, Erlich HA (1988). DNA 
typing from single hairs. Nature 332(6164):543-546. 

Hochmeister MN, Budowle B, Borer UV, Eggmann U, Comey CT,
Dirnhofer R (1991). Typing of deoxyribonucleic acid (DNA) extracted
from compact bone from human remains. J Forensic Sci 36(6):1649-1661.




Hochmeister MN, Budowle B, Jung J, Borer UV, Comey CT, Dirnhofer R 
(1991). PCR-based typing of DNA extracted from cigarette butts. Int J 
Legal Med 104(4):229-233. 

Hoff-Olsen P et al. (1999). Extraction of DNA from decomposed human
tissue: an evaluation of five extraction methods for short tandem repeat 
typing. Forensic Sci Int 105:171-183. 

Hood L, Galas D (2003). The digital code of DNA. Nature
421:444-448.

Horn GT, Richards B, Klinger KW (1989). Amplification of a highly 
polymorphic VNTR segment by the polymerase chain reaction. Nucleic 
Acids Res 17(5):2140.

Housman D (1995). Human DNA Polymorphism. N Engl J Med 332:318- 
320. 

Hsu TC (1952). Mammalian chromosomes in vitro: Karyotype of man. J 
Heredity 43:167-172. 

Hughes JF, Rozen S (2012). Genomics and Genetics of Human and 
Primate Y Chromosomes. Annu Rev Genomics Hum Genet 13:83-108.

Hurles ME, Jobling MA (2001). Haploid chromosomes in molecular
ecology: lessons from the human. Mol Ecol 10:1599-613. 

Indian Genome Variation Consortium (2008). Genetic landscape of the 
people of India: a canvas for disease gene exploration. J Genet 87:3-20.

International HapMap Consortium (2003). The International HapMap
Project. Nature 426:789-796. 

International HapMap Consortium (2005). A haplotype map of the human 
genome. Nature 437:1299-1320. 

International Human Genome Sequencing Consortium (2004). Finishing
the euchromatic sequence of the human genome. Nature 431:931-945.




James SH, Nordby JJ (2005). Forensic Science: An Introduction to
Scientific & Investigative Techniques, 2nd edition. CRC Press, Boca
Raton, FL.

Jeffreys AJ, Wilson V, Neumann R, Keyte J (1988). Amplification of 
human minisatellites by the polymerase chain reaction: towards DNA 
fingerprinting of single cells. Nucleic Acids Res 16(23):10953-10971.

Jeffreys AJ, Wilson V, Thein SL (1985). Hypervariable 'minisatellite'
regions in human DNA. Nature 314(6006):67-73. 

Jeffreys AJ, Wilson V, Thein SL (1985). Individual-specific 'fingerprints' 
of human DNA. Nature 316(6023):76-79. 

Jeffreys AJ, Wilson V, Thein SL, Weatherall DJ, Ponder BA (1986). DNA 
"fingerprints" and segregation analysis of multiple markers in human 
pedigrees. Am J Hum Genet 39(1):11-24. 

Jegalian K, Lahn BT (2001). Why the Y is so weird. Scientific
American 284:56-61.

Jobling MA (2012). The impact of recent events on human genetic 
diversity. Philos Trans R Soc Lond B Biol Sci 367:793-799. 

Jobling MA, Tyler-Smith C (2003). The human Y chromosome: an 
evolutionary marker comes of age. Nat Rev Genet 4:598-612. 

Kasai K, Nakamura Y, White R (1990). Amplification of a variable 
number of tandem repeats (VNTR) locus (pMCT118) by the polymerase 
chain reaction (PCR) and its application to forensic science. J Forensic
Sci 35(5):1196-1200. 

Kayser M, Caglia A, Corach D, Fretwell N, Gehrig C, Graziosi G,
Heidorn F, Herrmann S, Herzog B, Hidding M, Honda K, Jobling M,
Krawczak M, Leim K, Meuser S, Meyer E, Oesterreich W, Pandya A,
Parson W, Penacino G, Perez-Lezaun A, Piccinini A, Prinz M, Schmitt C,
Schneider PM, Szibor R, Teifel-Greding J, Weichhold GM, de Knijff P,
Roewer L (1997). Evaluation of Y-chromosomal STRs: a multicenter
study. Int J Legal Med 110(3):125-33, 141-9.

Kayser M, Kittler R, Erler A, Hedman M, Lee AC, Mohyuddin A, Mehdi 
SQ, Rosser Z, Stoneking M, Jobling MA, Sajantila A, Tyler-Smith C 
(2004). A comprehensive survey of human Y-chromosomal 
microsatellites. Am J Hum Genet 74(6):1183-1197. 

Kayser M, Sajantila A (2001). Mutations at Y-STR loci: implications for 
paternity testing and forensic analysis. Forensic Sci Int 118(2-3):116-21.

Kendall EC, Osterberg AE (1919). The chemical identification of
Thyroxin. J Biol Chem 40:265-334.

Khurana P, Aggarwal A, Mitra S, Italia YM, Saraswathy KN, 
Chandrasekar A, Kshatriya GK (2014). Y Chromosome Haplogroup 
Distribution in Indo-European Speaking Tribes of Gujarat, Western India. 
PLoS One 9(3):e90414. 

Kimpton C, Fisher D, Watson S, Adams M, Urquhart A, Lygo J, Gill P 
(1994). Evaluation of an automated DNA profiling system employing 
multiplex amplification of four tetrameric STR loci. Int J Legal Med 
106(6):302-311. 

Kimpton CP, Gill P, Walton A, Urquhart A, Millican ES, Adams M 
(1993). Automated DNA profiling employing multiplex amplification of 
short tandem repeat loci. PCR Methods Appl 3(1):13-22. 

Kimpton CP, Oldroyd NJ, Watson SK, Frazier RRE, Johnson PE, Millican
ES, Urquhart A, Sparkes BL, Gill P (1996). Validation of highly
discriminating multiplex short tandem repeat amplification systems for
individual identification. Electrophoresis 17(8):1283-1293.




Kirby LT (1992). DNA Fingerprinting: An Introduction. W.H. Freeman 
and Company, NY. 

Kivisild T, Rootsi S, Metspalu M, Mastana S, Kaldma K, Parik J, 
Metspalu E, Adojaan M, Tolk HV, Stepanov V, Gölge M, Usanga E,
Papiha SS, Cinnioglu C, King R, Cavalli-Sforza L, Underhill PA, Villems 
R (2003). The genetic heritage of the earliest settlers persists both in 
Indian tribal and caste populations. Am J Hum Genet 72(2):313-32. 

Klug WS, Cummings MR, Spencer CA (2006). Concepts of Genetics, 8th
edition. Pearson Education, Inc., NJ.

Lahn BT, Pearson NM, Jegalian K (2001). The human Y chromosome, in 
the light of evolution. Nat Rev Genet 2:207-216. 

Lander ES et al. (2001). Initial sequencing and analysis of the human
genome. Nature 409(6822):860-921. 

Lawrence K, Liotti TF, Oeser-Sweat J (2005). DNA: Forensic & Legal
Applications. John Wiley & Sons, NJ. 

Lederberg J (1994). Honoring Avery, MacLeod, and McCarty: The team 
that transformed genetics. The Scientist 8:11. 

Levene PA (1919). The structure of yeast nucleic acid. IV. Ammonia 
hydrolysis. J Biol Chem 40:415-424.

Lewin B (2007). Genes IX, 9th edition. Oxford University Press, NY.

Litt M, Luty JA (1989). A hypervariable microsatellite revealed by in
vitro amplification of a dinucleotide repeat within the cardiac muscle actin 
gene. Am J Hum Genet 44(3):397-401. 

Livak KJ, Flood SJ, Marmaro J, Giusti W, Deetz K (1995).
Oligonucleotides with fluorescent dyes at opposite ends provide a
quenched probe system useful for detecting PCR product and nucleic acid
hybridization. PCR Methods Appl 4(6):357-62.




Ludwig EH, Friedl W, McCarthy BJ (1989). High-resolution analysis of a 
hypervariable region in the human apolipoprotein B gene. Am J Hum 
Genet 45(3):458-464. 

Macaulay V, Hill C, Achilli A, Rengo C, Clarke D, Meehan W, Blackburn 
J, Semino O, Scozzari R, Cruciani F, et al. (2005). Single, rapid coastal
settlement of Asia revealed by analysis of complete mitochondrial 
genomes. Science 308:1034-1036. 

Majumder PP (1998). People of India: Biological diversity and affinities. 
Evol Anthropol 6:100-110. 

Malaspina P, Persichetti F, Novelletto A, Iodice C, Terrenato L, Wolfe J,
Ferraro M, Prantera G (1990). The human Y chromosome shows a low
level of DNA polymorphism. Ann Hum Genet 54:297-305.

Markoulatos P, Siafakas N, Moncany M (2002). Multiplex polymerase 
chain reaction: a practical approach. J Clin Lab Anal 16(1):47-51. 

Maroni G (2001). Molecular and Genetic Analysis of Human Traits. 
Blackwell Science Inc., MA. 

Mathias N, Bayes M, Tyler-Smith C (1994). Highly informative 
compound haplotypes for the human Y chromosome. Hum Mol Genet 
3:115-123. 

Maxam A, Gilbert W (1977). A new method of sequencing DNA. Proc 
Natl Acad Sci USA 74:560-564. 

McCarty M (1994). A retrospective look: How we identified the 
pneumococcal transforming substance as DNA. J Exp Med 179:385-394.

Melissa A, Sayres W, Lohmueller KE, Nielsen R (2014). Natural
Selection Reduced Diversity on Human Y Chromosomes. PLoS Genet
10(1):e1004064.




Mendez FL, Krahn T, Schrack B, Krahn AM, Veeramah KR, Woerner 
AE, Fomine FL, Bradman N, Thomas MG, Karafet TM, Hammer MF 
(2013). An African American paternal lineage adds an extremely ancient 
root to the human Y chromosome phylogenetic tree. Am J Hum Genet 
92(3):454-9. 

Meselson M, Yuan R (1968). DNA restriction enzyme from E. coli. 
Nature 217:1110-1114. 

Misra VN (2001). Prehistoric human colonization of India. J Biosci 
26:491-531. 

Mountain JL, Hebert JM, Bhattacharyya S, Underhill PA, Ottolenghi C, 
Gadgil M, Cavalli-Sforza LL (1995). Demographic history of India and 
mtDNA-sequence diversity. Am J Hum Genet 56:979-992. 

Mukerjee S, Mukherjee M, Ghosh T, Kalpana D, Sharma AK (2013). 
Differential pattern of genetic variability at the DXYS156 locus on 
homologous regions of X and Y chromosomes in Indian population and its 
forensic implications. Int J Legal Med 127(1):1-6. 

Mukherjee MB, Tripathy V, Colah RB, Solanki PK, Ghosh K, Reddy BM, 
Mohanty D (2009). Microsatellite diversity among the primitive tribes of 
India. Indian J Hum Genet 15(3):114-20. 

Mullis K, Faloona F, Scharf S, Saiki R, Horn G, Erlich H (1986). Specific 
enzymatic amplification of DNA in vitro: the polymerase chain 
reaction. Cold Spring Harb Symp Quant Biol 51(Pt 1):263-273. 

Mullis KB, Faloona FA (1987). Specific synthesis of DNA in vitro via a 
polymerase-catalyzed chain reaction. Methods Enzymol 155:335-350. 

Nair SP, Geetha A, Jagannath C (2011). Y-short tandem repeat haplotype
and paternal lineage of the Ezhava population of Kerala, south India.
Croat Med J 52(3):344-50.




Nakamura Y, Lathrop M, O'Connell P, Leppert M, Lalouel JM, White R 
(1988). A primary map of ten DNA markers and two serological markers 
for human chromosome 19. Genomics 3(1):67-71.

Nakamura Y, Leppert M, O’Connell P, Wolf R, Holm T, Culver M, 
Martin C, Fujimoto E, Hoff M, Kumlin E, White R (1987). Variable 
number of tandem repeat (VNTR) markers for human gene mapping. 
Science 235:1616-1622. 

Nei M (1973). Analysis of gene diversity in subdivided populations. Proc 
Natl Acad Sci USA 70(12):3321-3323. 

Nei M (1987). Molecular Evolutionary Genetics. Columbia University 
Press, NY. 

Nei M, Kumar S (2000). Molecular Evolution and Phylogenetics. Oxford 
University Press, NY. 

Nei M, Roychoudhury AK (1974). Sampling variances of heterozygosity 
and genetic distance. Genetics 76(2):379-390. 

Nirenberg MW et al. (1966). The RNA code and protein synthesis. Cold
Spring Harbor Symposia on Quantitative Biology 31:11-24. 

Nirenberg MW, Jones OW, Leder P, Clark BFC, Sly WS, Pestka S (1963). 
On the Coding of Genetic Information. Cold Spring Harb Symp Quant 
Biol 28:549-557. 

O'Connell P, Lathrop GM, Leppert M, Nakamura Y, Müller U, Lalouel
JM, White R (1988). Twelve loci form a continuous linkage map for 
human chromosome 18. Genomics 3(4):367-372. 

Odelberg SJ, Plaetke R, Eldridge JR, Ballard L, O'Connell P, Nakamura 
Y, Leppert M, Lalouel JM, White R (1989). Characterization of eight 
VNTR loci by agarose gel electrophoresis. Genomics 5(4):915-924. 




Painter T (1921). The Y-chromosome in mammals. Science 53(1378):503- 
504. 

Painter TS (1923). Studies in mammalian spermatogenesis II: The
spermatogenesis of man. J Exp Zool 37:291-336. 

Palha T, Ribeiro-Rodrigues E, Ribeiro-dos-Santos A, Santos S (2012). 
Fourteen short tandem repeat loci Y chromosome haplotypes: Genetic 
analysis in populations from northern Brazil. Forensic Sci Int Genet 
6(3):413-8. 

Papiha SS (1996). Genetic variation in India. Hum Biol 68:607-628.

Parvathy SN, Geetha A, Jagannath C (2012). Haplotype analysis of the
polymorphic 17 YSTR markers in Kerala nontribal populations. Mol Biol 
Rep 39(6):7049-59. 

Pascali VL et al. (1998). Coordinating Y-chromosomal STR research for 
the Courts. Int J Legal Med 112:1. 

Pauling L, Corey RB (1953). A Proposed Structure for The Nucleic Acids. 
Proc Natl Acad Sci USA 39(2):84-97. 

Pauling L, Corey RB (1953). Structure of the nucleic acids. Nature 
171(4347):346.

Perveen R, Rahman Z, Shahzad MS, Israr M, Shafique M, Shan MA, Zar 
MS, Iqbal M, Husnain T (2014). Y-STR haplotype diversity in Punjabi 
population of Pakistan. Forensic Sci Int Genet 9:e20-21. 

Poznik GD, Henn BM, Yee M, Sliwerska E, Euskirchen GM, Lin AA,
Snyder M, Murci LQ, Kidd JM, Underhill PA, Bustamante CD (2013).
Sequencing Y Chromosomes Resolves Discrepancy in Time to Common
Ancestor of Males Versus Females. Science 341(6145):562-565.




Premi S, Srivastava J, Chandy SP, Ali S (2009). Unique signatures of 
natural background radiation on human Y chromosomes from Kerala, 
India. PLoS One 4(2):e4541. 

Primrose SB, Twyman RM (2003). Principles of Genome Analysis and
Genomics, 3rd edition. Blackwell Publishing Co., MA.

Prinz M (2003). Advantages and disadvantages of Y-short tandem repeat 
testing in forensic casework. Forensic Sci Rev 15:189-196. 

Prinz M, Boll K, Baum H, Shaler B (1997). Multiplexing of Y 
chromosome specific STRs and performance for mixed samples. Forensic 
Sci Int 85(3):209-218. 

Pritchard JK, Seielstad MT, Perez-Lezaun A, Feldman MW (1999). 
Population growth of human Y chromosomes: a study of Y chromosome 
microsatellites. Mol Biol Evol 16(12):1791-1798. 

Puers C, Hammond HA, Jin L, Caskey CT, Schumm JW (1993). 
Identification of repeat sequence heterogeneity at the polymorphic short 
tandem repeat locus HUMTH01 [AATG]n and reassignment of alleles in
population analysis by using a locus-specific allelic ladder. Am J Hum 
Genet 53(4):953-958. 

Ramana GV, Su B, Jin L, Singh L, Wang N, Underhill P, Chakraborty R 
(2001). Y-chromosome SNP haplotypes suggest evidence of gene flow 
among caste, tribe, and the migrant Siddi populations of Andhra Pradesh, 
South India. Eur J Hum Genet 9(9):695-700. 

Redd AJ, Agellon AB, Kearney VA, Contreras VA, Karafet T, Park H, de
Knijff P, Butler JM, Hammer MF (2002). Forensic value of 14 novel
STRs on the human Y-chromosome. Forensic Sci Int 130(2-3):97-111.

Redd AJ, Clifford SL, Stoneking M (1997). Multiplex DNA typing of
short tandem repeat loci on the Y-chromosome. Biol Chem 378(8):923-7.




Regueiro M, Rivera L, Chennakrishnaiah S, Popovic B, Andjus S, Milasin 
J, Herrera RJ (2012). Ancestral modal Y-STR haplotype shared among 
Romani and South Indian populations. Gene 504(2):296-30. 

Reich D, Thangaraj K, Patterson N, Price AL, Singh L (2009). 
Reconstructing Indian population history. Nature 461:489-494. 

Reynolds R, Sensabaugh G, Blake E (1991). Analysis of genetic markers 
in forensic DNA samples using the polymerase chain reaction. Anal Chem 
63(1):2-15. 

Rich A, Watson JD (1954). Physical studies on ribonucleic acid. Nature 
173(4412):995-6. 

Robertson J, Ziegle J, Kronick M, Madden D, Budowle B (1991). Genetic 
typing using automated electrophoresis and fluorescence detection. EXS 
58:391-398. 

Roewer L (2009). Y chromosome STR typing in crime casework. 
Forensic Sci Med Pathol 5:77-84. 

Roewer L, Arnemann J, Spurr NK, Grzeschik KH, Epplen JT (1992).
Simple repeat sequences on the human Y chromosome are equally
polymorphic as their autosomal counterparts. Hum Genet 89:389-394.

Roewer L, Epplen JT (1992). Rapid and sensitive typing of forensic stains
by PCR amplification of polymorphic simple repeat sequences in case 
work. Forensic Sci Int 53:163-171. 

Roewer L, Kayser M, Dieltjes P, Nagy M, Bakker E, Krawczak M, de
Knijff P (1996). Analysis of molecular variance (AMOVA) of Y-
chromosome-specific microsatellites in two closely related human
populations. Hum Mol Genet 5:1029-33.




Roewer L, Kayser M, Nagy M, de Knijff P (1996). Male identification 
using Y-chromosomal STR polymorphisms. Adv Forensic Haemogenet 
6:124-126. 

Roewer L, Krawczak M, Willuweit S, Nagy M, Alves C, Amorim A,
Anslinger K, Augustin C, Betz A, Bosch E, Caglia A, Carracedo A, 
Corach D, Dekairelle A, Dobosz T, Dupuy BM, Furedi S, Gehrig C, 
Gusmao L, Henke J, Henke L, Hidding M, Hohoff C, Hoste B, Jobling 
MA, Kargel HJ, de Knijff P, Lessig R, Liebeherr E, Lorente M, Martinez- 
Jarreta B, Nievas P, Nowak M, Parson W, Pascali VL, Penacino G, Ploski 
R, Rolf B, Sala A, Schmidt U, Schmitt C, Schneider PM, Szibor R, Teifel- 
Greding J, Kayser M (2001). Online reference database of European Y- 
chromosomal short tandem repeat (STR) haplotypes. Forensic Sci Int 
118(2-3):106-13. 

Ruddle FH, Painter T (2004). First steps toward an understanding of the 
human genome. J Exp Zool 301(part A):375-377. 

Saey TH (2012). DNA hints at African cousin to humans. Science News 
182:9. 

Saha A, Udhayasuriyan PT, Bhat KV, Bamezai R (2003). Analysis of 
Indian population based on Y-STRs reveals existence of male gene flow 
across different language groups. DNA Cell Biol 22(11):707-19. 

Sahoo S, Singh A, Himabindu G, Banerjee J, Sitalaximi T, Gaikwad S, 
Trivedi R, Endicott P, Kivisild T, Metspalu M, Villems R, Kashyap VK 
(2006). A prehistory of Indian Y chromosomes: evaluating demic 
diffusion scenarios. Proc Natl Acad Sci USA 103(4):843-8. 

Saiki RK, Bugawan TL, Horn GT, Mullis KB, Erlich HA (1986). Analysis 
of enzymatically amplified beta-globin and HLA-DQ alpha DNA with 
allele-specific oligonucleotide probes. Nature 324(6093):163-166. 




Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis 
KB, Erlich HA (1988). Primer-directed enzymatic amplification of DNA 
with a thermostable DNA polymerase. Science 239(4839):487-91. 

Saiki RK, Scharf S, Faloona F, Mullis KB, Horn GT, Erlich HA, Arnheim 
N (1985). Enzymatic amplification of beta-globin genomic sequences and 
restriction site analysis for diagnosis of sickle cell anemia. Science 
230(4732):1350-1354. 

Sajantila A, Puomilahti S, Johnsson V, Ehnholm C (1992). Amplification 
of reproducible allele markers for amplified fragment length 
polymorphism analysis. Biotechniques 12(1):16-22. 

Sajantila A, Ström M, Budowle B, Karhunen PJ, Peltonen L (1991). The
polymerase chain reaction and post-mortem forensic identity testing:
application of amplified D1S80 and HLA-DQ alpha loci to the
identification of fire victims. Forensic Sci Int 51(1):23-34.

Sajantila A, Ström M, Budowle B, Tienari PJ, Ehnholm C, Peltonen L
(1991). The distribution of the HLA-DQ alpha alleles and genotypes in the
Finnish population as determined by the use of DNA amplification and
allele specific oligonucleotides. Int J Legal Med 104(4):181-184.

Sambrook J, Russell D (2001). Molecular cloning: a laboratory manual,
3rd edition. Cold Spring Harbor Laboratory Press, NY. 

Sanger F, Nicklen S, Coulson AR (1977). DNA sequencing with chain- 
terminating inhibitors. Proc Natl Acad Sci USA 74(12):5463-5467. 

Saxena R, Brown LG, Hawkins T, Alagappan RK, Skaletsky H, Reeve 
MP, Reijo R, Rozen S, Dinulos MB, Disteche CM, Page DC (1996). The 
DAZ gene cluster on the human Y chromosome arose from an autosomal 
gene that was transposed, repeatedly amplified and pruned. Nat Genet 
14:292-299. 




Schiffner LA et al. (2005). Optimization of a simple, automatable
extraction method to recover sufficient DNA from low copy number DNA 
samples for generation of short tandem repeat profiles. Croat Med J 
46:578-586. 

Schneider PM et al. (1998). Tandem repeat structure of the duplicated Y- 
chromosomal STR locus DYS385 and frequency studies in the German 
and three Asian populations. Forensic Sci Int 97:61-70. 

Schoske R, Vallone PM, Ruitberg CM, Butler JM (2003). Multiplex PCR 
design strategy used for the simultaneous amplification of 10 Y- 
chromosome short tandem repeat (STR) loci. Anal Bioanal Chem 
375(3):333-43. 

Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow
CE, Lin AA, Mitra M, Sil SK, Ramesh A, et al. (2006). Polarity and
temporality of high-resolution Y-chromosome distributions in India
identify both indigenous and exogenous expansions and reveal minor
genetic influence of Central Asian pastoralists. Am J Hum Genet
78:202-221.

Shah AM, Tamang R, Moorjani P, Rani DS, Govindaraj P, Kulkarni G, 
Bhattacharya T, Mustak MS, Bhaskar LV, Reddy AG, et al. (2011). Indian
Siddis: African descendants with Indian admixture. Am J Hum Genet
89:154-161. 

Sinha SK, Budowle B, Arcot SS, Richey SL, Chakraborty R, Jones MD,
Wojtkiewicz PW, Schoenbauer DA, Gross AM, Sinha SK, Shewale JG
(2003). Development and validation of a multiplexed Y-chromosome STR
genotyping system, Y-PLEX 6, for forensic casework. J Forensic Sci
48(1):93-103.




Skaletsky H, Kuroda KT, Minx PJ, Cordum HS, Hillier L, Brown LG, 
Repping S, Pyntikova T, Ali J, Bieri T, Chinwalla A, Delehaunty A, 
Delehaunty K, Du H, Fewell G, Fulton L, Fulton R, Graves T, Hou SF, 
Latrielle P, Leonard S, Mardis E, Maupin R, McPherson J, Miner T, Nash 
W, Nguyen C, Ozersky P, Pepin K, Rock S, Rohlfing T, Scott K, Schultz 
B, Strong C, Wollam AT, Yang SP, Waterston RH, Wilson RK, Rozen S, 
Page DC (2003). The male-specific region of the human Y chromosome is 
a mosaic of discrete sequence classes. Nature 423:825-837. 

Speicher MR, Ballard SG, Ward DC (1996). Karyotyping human 
chromosomes by combinatorial multi-fluor FISH. Nat Genet 12:368-375.

Spurdle AB, Jenkins T (1992). The Y chromosome as a tool for studying
human evolution. Curr Opin Genet Dev 2(3):487-91. 

Steinman RM, Moberg CL (1994). A triple tribute to the experiment that 
transformed biology. J Exp Med 179:379-384. 

Stern C (1957). The problem of complete Y-linkage in man. Am J Hum 
Genet 9(3):147-166. 

Strachan T, Read AP (1999). Human Molecular Genetics, 2nd edition.
Wiley-Liss, NY. 

Sullivan KM, Pope S, Gill P, Robertson JM (1992). Automated DNA 
profiling by fluorescent labeling of PCR products. PCR Methods Appl 
2(1):34-40. 

Tamang R, Thangaraj K (2012). Genomic view on the peopling of India. 
Investig Genet 3(1):20. 

Tautz D (1989). Hypervariability of simple sequences as a general source 
for polymorphic DNA markers. Nucleic Acids Res 17:6463-6471. 

Tautz D, Renz M (1984). Simple sequences are ubiquitous repetitive
components of eukaryotic genomes. Nucleic Acids Res 12:4127-4138.




Thangaraj K, Chaubey G, Reddy AG, Singh VK and Singh L (2006). 
Unique origin of Andaman Islanders: insight from autosomal loci. J Hum 
Genet 51:800-804. 

Thangaraj K, Chaubey G, Singh VK, Reddy AG, Chauhan P, Malvee R, 
Pavate PP, Singh L (2007). Y-chromosomal STR haplotypes in two 
endogamous tribal populations of Karnataka, India. J Forensic Sci 
52(3):751-3. 

Thangaraj K, Chaubey G, Singh VK, Vanniarajan A, Thanseem I, Reddy 
AG and Singh L (2006). In situ origin of deep rooting lineages of 
mitochondrial Macrohaplogroup 'M' in India. BMC Genomics 7:151. 

Thangaraj K, Naidu BP, Crivellaro F, Tamang R, Upadhyay S, Sharma 
VK, Reddy AG, Walimbe SR, Chaubey G, Kivisild T and Singh L (2010). 
The influence of natural barriers in shaping the genetic structure of 
Maharashtra populations. PLoS One 5:e15283. 

Thangaraj K, Nandan A, Sharma V, Sharma VK, Eaaswarkhanth M, Patra 
PK, Singh S, Rekha S, Dua M, Verma N et al. (2009). Deep rooting in-situ 
expansion of mtDNA Haplogroup R8 in South Asia. PLoS One 4:e6545. 

Thangaraj K, Ramana GV, Singh L (1999). Y-chromosome and 
mitochondrial DNA polymorphisms in Indian populations. 
Electrophoresis 20:1743-1747. 

Thangaraj K, Singh L, Reddy AG, Rao VR, Sehgal SC, Underhill PA, 
Pierson M, Frame IG, Hagelberg E (2003). Genetic affinities of the 
Andaman islanders, a vanishing human population. Curr Biol 13(2):86-93. 


Thanseem I, Thangaraj K, Chaubey G, Singh VK, Bhaskar LV, Reddy 
BM, Reddy AG, Singh L (2006). Genetic affinities among the lower castes 
and tribal groups of India: inference from Y chromosome and 
mitochondrial DNA. BMC Genet 7:42. 

Thompson T, Black S (2007). Forensic Human Identification: An 
Introduction. CRC Press, Boca Raton, FL. 

Tjio JH, Levan A (1956). The chromosome number of man. 
Hereditas 42:1-6. 

Trask BJ (2002). Human cytogenetics: 46 chromosomes, 46 years and 
counting. Nature Rev Genet 3:769-778. 

Underhill PA and Kivisild T (2007). Use of Y-chromosome and 
mitochondrial DNA population structure in tracing human 
migrations. Annu Rev Genet 41:539-564. 

Underhill PA, Myres NM, Rootsi S, Metspalu M, Zhivotovsky LA, King 
RJ, Lin AA, Chow CE, Semino O, Battaglia V et al. (2010). Separating 
the post-Glacial coancestry of European and Asian Y chromosomes within 
haplogroup R1a. Eur J Hum Genet 18:479-484. 

Underhill PA, Shen P, Lin AA, Jin L, Passarino G, Yang WH, Kauffman 
E, Bonne-Tamir B, Bertranpetit J, Francalacci P, Ibrahim M, Jenkins T, 
Kidd JR, Mehdi SQ, Seielstad MT, Wells RS, Piazza A, Davis RW, 
Feldman MW, Cavalli-Sforza LL, Oefner PJ (2000). Y chromosome 
sequence variation and the history of human populations. Nat Genet 
26(3):358-361. 

Urquhart A, Kimpton CP, Downes TJ, Gill P (1994). Variation in short 
tandem repeat sequences: a survey of twelve microsatellite loci for use as 
forensic identification markers. Int J Leg Med 107:13-20. 

Urquhart A, Kimpton CP, Gill P (1993). Sequence variability of the 
tetranucleotide repeat of the human beta-actin related pseudogene 
H-beta-Ac-psi-2 (ACTBP2) locus. Hum Genet 92:637-638. 




Van-Oorschot RAH, Ballantyne KN, Mitchell RJ (2010). Forensic trace 
DNA: a review. Investig Genet 1:14. 

Venter JC et al. (2001). The sequence of the human genome. Science 
291(5507):1304-51. 

Vihko P, Mattila K, Ehnholm C (1981). Radioimmunoassay of human 
prostate-specific acid phosphatase. A sensitive and specific assay for 
semen detection in forensic medicine. Am J Clin Pathol 75(2):219-220. 

Voet D, Pratt CW, Voet JG (2012). Principles of Biochemistry, 4th edition. 
John Wiley & Sons, Inc., NY. 

Vuorio AF, Sajantila A, Hämäläinen T, Syvänen AC, Ehnholm C, 
Peltonen L (1990). Amplification of the hypervariable region close to the 
apolipoprotein B gene: application to forensic problems. Biochem Biophys 
Res Commun 170(2):616-620. 

Walsh PS, Metzger DA, Higuchi R (1991). Chelex 100 as a medium for 
simple extraction of DNA for PCR-based typing from forensic 
material. Biotechniques 10(4):506-513. 

Wang CC, Jin L, Li H (2014). Natural Selection on Human Y 
Chromosomes. J Genet Genomics 41(2):47-52. 

Waterston RH, Lander ES, Sulston JE (2003). More on the sequencing of 
the human genome. Proc Nat Acad Sci USA 100(6):3022-3024. 

Watkins WS, Bamshad M, Dixon ME, Bhaskara Rao B, Naidu JM, Reddy 
PG, Prasad BV, Das PK, Reddy PC, Gai PB et al. (1999) Multiple origins 
of the mtDNA 9-bp deletion in populations of South India. Am J Phys 
Anthropol 109:147-158. 

Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R (2013). 
Molecular Biology of the Gene, 7th Edition. Pearson Education, NJ. 




Watson JD, Crick FHC (1953). Genetical implications of the structure of 
Deoxyribonucleic Acid. Nature 171:964-967. 

Watson JD, Crick FHC (1953). Molecular Structure of Nucleic Acids: A 
Structure for Deoxyribose Nucleic Acids. Nature 171:737-738. 

Watson JD, Witkowski JA, Kobilinsky L, Liotti T, Oeser-Sweat JL 
(2004). DNA: Forensic and Legal Applications. Wiley-Blackwell 
Publisher, NY. 

Wayman E (2012). Africans’ genes mute on human birthplace. Science 
News 182:9. 

Weber JL, May PE (1989). Abundant class of human DNA 
polymorphisms which can be typed using the polymerase chain reaction. 
Am J Hum Genet 44:388-96. 

Weber K, Osborn M (1969). The reliability of molecular weight 
determinations by dodecyl sulfate-polyacrylamide gel electrophoresis. J 
Biol Chem 244:4406-12. 

Wei W, Ayub Q, Chen Y, McCarthy S, Hou Y, Carbone I, Xue Y, Tyler- 
Smith C (2013). A calibrated human Y-chromosomal phylogeny based on 
resequencing. Genome Res 23(2):388-95. 

Weir BS (2001). Handbook of Forensic Genetics. John Wiley and Sons, 
NY. 

Weir BS, Cockerham CC (1984). Estimating F-Statistics for the Analysis 
of Population Structure. Evolution 38:1358-1370. 

Weissenbach J, Gyapay G, Dib C, Vignal A, Morissette J, Millasseau P, 
Vaysseix G, Lathrop M (1992). A second-generation linkage map of the 
human genome. Nature 359(6398):794-801. 




White PS, Tatum OL, Deaven LL, Longmire JL (1999). New, male- 
specific microsatellite markers from the human Y chromosome. Genomics 
57(3):433-7. 

Whitfield LS, Sulston JE, Goodfellow PN (1995). Sequence variation of 
the human Y chromosome. Nature 378:379-380. 

Wilkins MHF, Gosling RG, Seeds WE (1951). Physical studies of nucleic 
acid. Nature 167(4254):759-60. 

Wilkins MHF, Stokes AR, Wilson HR (1953). Molecular Structure of 
Deoxypentose Nucleic Acids. Nature 171:738-740. 

Willuweit S, Roewer L (2007). Y chromosome haplotype reference 
database (YHRD): update. Forensic Sci Int Genet 1:83-87. 

Wolf G (2003). Friedrich Miescher: The man who discovered DNA. 
Chemical Heritage 21:10-11, 37-41. 

Wong Z, Wilson V, Jeffreys AJ, Thein SL (1986). Cloning a selected 
fragment from a human DNA 'fingerprint': isolation of an extremely 
polymorphic minisatellite. Nucleic Acids Res 14(11):4605-4616. 

Wong Z, Wilson V, Patel I, Povey S, Jeffreys AJ (1987). Characterization 
of a panel of highly variable minisatellites cloned from human DNA. Ann 
Hum Genet 51(Pt 4):269-288. 

Wright S (1951). The genetic structure of populations. Ann Eugen 
15:323-354. 

Wurmb-Schwark N et al. (2006). Fast and simple DNA extraction from 
saliva and sperm cells obtained from the skin or isolated from swabs. 
Legal Med 8:177-181. 

Yadav B, Raina A, Dogra TD (2011). Haplotype diversity of 17 Y- 
chromosomal STRs in Saraswat Brahmin Community of North India. 
Forensic Sci Int Genet 5(3):e63-70. 




Yang YR, Jing YT, Zhang GD, Fang XD, Yan JW (2014). Genetic 
analysis of 17 Y-chromosomal STR loci of Chinese Tujia ethnic group 
residing in Youyang Region of Southern China. Leg Med (Tokyo) 
16(3):173-5. 

Y-Chromosome Consortium (2002). A nomenclature system for the tree of 
human Y-chromosomal binary haplogroups. Genome Res 12:339-348. 

Zerjal T, Pandya A, Thangaraj K, Ling EY, Kearley J, Bertoneri S, 
Paracchini S, Singh L, Tyler-Smith C (2007). Y-chromosomal insights 
into the genetic impact of the caste system in India. Hum Genet 
121(1):137-44. 




Publications 


1. Nayak BP, Khajuria H, Gupta S (2013). Y-STR Polymorphism among 
Khandayat Community of Odisha, India. Res J Forensic Sci 1(3):5-6. 

2. Nayak BP, Khajuria H, Gupta S (2014). Genetic Analysis of Y- 
chromosomal STRs in Khandayat Population of Odisha, India. Int 
Res J Biol Sci 3(7):26-28. 

3. Nayak BP, Khajuria H, Gupta S (2014). Y-STR haplotype diversity 
among the Khandayat population of Odisha, India. Egypt J Forensic 
Sci doi:10.1016/j.ejfs.2014.07.003. 




